Sb energy1

From CHARMM tutorial
Jump to navigationJump to search

A brief introduction to force fields

This section of the tutorial deals with how CHARMM calculates the potential energy of a molecular system and its first derivative (the force). CHARMM supports methods for evaluating molecular systems at varying levels of theory; this section introduces the purely classical CHARMM force field (which is distinct from the program itself). CHARMM the program includes support for several other force fields as well as mechanisms to perform ab initio and semiempirical quantum energies and for coarse-grained modeling. These topics are beyond the scope of this tutorial. In computational chemistry, a force field consists of two things: (1) a functional form defining how to compute the energies and forces on each particle of the system, and (2) a set of parameters that define how positional relationships between atoms determine their energy (e.g. how the bonded potential energy changes as a bind stretches or contracts, that is, as the distance between two bonded atoms increases or decreases).

A full treatment of force fields and their development is beyond the scope of this tutorial. Interested readers are encouraged to consult the Wikipedia entry on force fields and the molecular modeling references given in the introduction.

Extended background on calculating energies and forces with CHARMM

Energy calculation

The force field terms

The molecular mechanics force field used by CHARMM and similar computer programs is a simplified, parametrized representation of reality. Interactions between chemically bonded nearest neighbors are handled by special terms, the so-called bonded energy terms: e.g., bond stretching, angle bending, dihedral and improper dihedral energy terms. Interactions beyond (chemically bonded) nearest neighbors are represented by the two so-called non-bonded energy terms (Lennard-Jones (LJ) and Coulomb interactions). Very early versions of the force-field contained a special term for hydrogen bonds; the code is still in CHARMM, but it is, in general, not used anymore.

Thus, the energy function of the latest CHARMM "param22"/"param27" force field has the following form (J. Phys. Chem. B, 1998, 102, 3586; J. Comput. Chem., 2004, 25, 1400; J. Comput. Chem., 2000, 21, 86, ibid. 105ff):

where consists of the following terms,






, and

consists of two terms,




The values of the naught terms above (e.g. in , in etc.), the various force constants (, etc.), as well as partial charges and LJ parameters (, ) are taken from the force field parameters. [SB: do we go into mixing rules for LJ?]

Both non-bonded terms are normally modulated by a shifting or switching function (however, periodic bondary boundary conditions may be used in place of this modulation, particularly for electrostatics), see below.

The above terms are the standard terms of molecular mechanics force fields just described, with two exceptions that require some explanation. The first is the so-called "Urey-Bradley" () term in addition to the standard bond stretching and angle bending term . It is a harmonic term in the distance between atoms 1 and 3 of (some) of the angle terms and was introduced on a case by case basis during the final optimization of vibrational spectra. The UB term turned out to be important "for the in-plane deformations as well as separating symmetric and asymmetric bond stretching modes (e.g., in aliphatic molecules)" (J. Phys. Chem. B 1998, 102, 3586).

A more recent addition to the CHARMM force field is the use of the CMAP procedure to treat the conformational properties of the protein backbone (J. Comput. Chem. 2004, 25, 1400). The so-call CMAP term is a cross-term for the (backbone dihedral angle) values, realized by grid based energy correction maps. It can, in principle, be applied to any pair of dihedral angles, but is used in the current CHARMM force field to improve the conformational properties of the protein backbone. Several programs claiming to be able to use the CHARMM force field(s) lack support for this term.

We note in passing that CHARMM can compute energies and forces with several non-CHARMM forcefields, including converted versions of the AMBER and OPLS-AA forcefield, but also more generic force fields, such as the Merck force field. [TODO: add refs] Use of these force fields is beyond the scope of this tutorial.

Calculation of non-bonded energies: cut-offs, shifting, switching etc.

The most time consuming part of an energy or force computation is the evaluation of the LJ and electrostatic interactions between (almost) all pairs of particles.

While straightforward in principle (and in practice energy/force calculation in CHARMM is as simple as


issuing a single command), numerous options influence the calculation of non-bonded interactions, and so it is very important that you understand their meaning and have the necessary background to understand the rationale for them in the first place.

To compute the interaction of N atoms with each other, one needs in principle NN steps. Bonded interactions only involve next neighbors, so they are cheap in terms of computer time. For the non-bonded energy (LJ, electrostatic) it is customary to introduce a cut-off beyond which interactions are ignored. This is a reasonable approximation for the van der Waals (= LJ) interactions, which decay rapidly for large distances, but a bad one for electrostatic interactions which go to zero as . For periodic systems (the default situation found when simulating solvated proteins), the so-called Ewald summation method (usually employed in its fast incarnation, particle-mesh-Ewald (PME)) circumvents this particular difficulty; nevertheless, cut-offs are ever present in molecular mechanics energy/force calculations and their implications need to be understood.

Nonbonded exclusions: Before continuing on the subject of cut-offs, we need to introduce the concept of nonbonded exclusions. To avoid computing interactions between chemically bonded atoms (1-2, 1-3 and, possibly, 1-4 neighbors) with multiple different force field terms (specific bonded terms, as well as the "unspecific" LJ and electrostatic term), these pairs are excluded from the nonbonded energy calculation. When using the CHARMM force fields, one never calculates nonbonded interactions between 1-2 and 1-3 neighbors. LJ and electrostatic interactions between 1-4 neighbors are calculated, but can be modified compared to more distant pairs (different LJ interactions, scaling of electrostatic interactions). The details depend on the specific force field used. E.g., in the older polar hydrogen force field ("param19"), electrostatic interactions and forces between 1-4 neighbors were scaled by 0.4, whereas in the current all-atom force field(s) ("param22", "param27"), electrostatic energies and forces are used as is. This scaling factor is controlled by one of the many options to the ENERgy command, E14Fac. Thus, for calculations with the old, polar-hydrogen "param19" force field, E14Fac should be set to 0.4; for calculations with the current "param22/27" force field it should be set to 1.0. IMPORTANT HINT: E14Fac is an example of a user settable option which should never be set by a normal user. When you read in the respective parameter file, the correct default value is automatically set. In fact, many of the options to the ENERgy command are of this type --- you should understand these, but, again, you should never change them (unless you are an expert and develop a new force field etc. in which case you won't need this tutorial!). We'll return to the issue of specifying all possible energy options as opposed to relying on defaults later on; unfortunately, this is less clear-cut, as it ought to be.

Back to cut-offs: The CHARMM keyword to set the cut-off for electrostatic and LJ interactions is CTOFnb. Thus, the statement

ENERgy CTOFnb 12.0 ! not recommended in practice

discards LJ and electrostatic interactions between all pairs of particles that are > 12.0 Angstroms apart (of course any nonbonded exclusions are always honored!). The straightforward application of a cut-off criterion as in the above (bad) example introduces a discontinuity that can adversely affect the stability of a molecular dynamics calculation. Consider a pair of atoms that are separated by approximately the cut-off distance. At one time-step they interact and contribute to the energy and forces in the system. Then they move relative to each other by a tiny amount and suddenly they are more than the cut-off distance apart. Although very little has changed, they now do not contribute to the energy and forces at all.

To avoid these abrupt changes, the LJ and electrostatic energy functions in CHARMM are (always) modulated by switching or shifting functions (it actually takes some tricks to coax CHARMM into doing effectively a plain cut-off for you!). For the full details, in particular the detailed functional forms of switching and shifting functions, see J. Comput. Chem. 1983, 4, 187; Proteins 1989, 6, 32; J. Comput. Chem. 1994, 15, 667. The effect of a switching function on a LJ interaction is illustrated by the following figure.

Graphical view of cutoffs

At CTOFnb 12., the interaction (and its derivative, i.e., the forces) has to be zero as shown. Up to an inner cut-off (CTONnb), interactions between the two particles are normal. Between CTONnb and CTOFnb, the normal interaction is switched off (hence the name) by a sigmoidal function. At distances greater than CTOFnb, interactions and forces are zero. The effect of a switching function is shown for two different values of the inner cut-off, 8 Ang (red) and 10 Ang (green).

A shifting function also modulates the normal interaction between two particles. The design goal is that the original interaction is modified so that it approaches zero continuously at the cut-off distance (CTOFnb); in addition, forces are required to vanish continuously as well at the cut-off distance. Visually, it looks as if the original interaction were "shifted" so that it became zero at the cut-off distance. The actual function is more complicated because of the requirement of continuity of potential and forces at the cut-off distance!! With shifting functions, only CTOFnb is meaningful (as the cut-off radius), whereas an inner cut-off (CTONnb) is ignored. Switching and shifting can be activated separately for LJ and electrostatic interactions; i.e., the keywords VSHIft and VSWItch control whether a shifting or switching function is applied to LJ interactions; similarly, SHIFt and SWITch choose between shifting and switching function for electrostatic interactions. To illustrate this by a realistic example, let's take a look at the options that correspond to the default force field choices for the current CHARMM all-atom force field (param22/27)

ENERgy SHIFt VSWItch CTOFnb 12. CTONnb 10.

The line indicates that electrostatic options are shifted to zero using a cut-off value of 12 Ang. (CTOFnb) and that LJ interactions are modulated by a switching function between 10 (CTONnb) and 12 Ang (CTOFnb). Any interactions beyond 12 Ang are discarded. Note that the CTONnb parameter is ignored for the shifted electrostatic interactions.

Beyond SHIFt/SWITch: In molecular dynamics, the determining factor are the forces, not the potential. Thus, it has been suggested to apply shifting/switching functions to the forces, rather than the potential. Such methods are available in CHARMM as well, driven by keywords VFSWitch (for LJ) and FSWItch / FHIFt (for electrostatics). For the gory details see nbonds.doc of the CHARMM documentation. The recently introduced, so-called IPS method provides yet another approach to handle cut-off in a smooth, continuous fashion. However, all these options go beyond the scope of this tutorial.

GROUp vs. ATOM: A cut-off criterion can actually be applied in two ways. The default is to apply it on an atom by atom basis (keywords ATOM, VATOm, the option with V in front pertains to the van der Waals (=LJ) energy computation, that without to the electrostatic energy computation). In addition, there is the possibility of a group based cut-off (keywords GROUp, VGROup). In the CHARMM force field topologies, the partial charges of molecules / residues are grouped into small groups, that are usually electrically neutral, or for charged residues, carry the net charge of +-1, +-2 etc. When GROUp/VGROup is set CHARMM first computes the center of geometry for each of these charge groups. If the distance between the centers of geometry for a pair of groups are separated by more than the cut-off distance (CTOFnb), then no interaction between all atoms of the two groups is computed. By contrast, in the atom by atom case, some pairs of atoms might be inside and others might be outside the cut-off distance. Thus, in general the two options give different results. (Note: you cannot mix group and atom based approaches; i.e., the combinations ATOM/VGRO and GROU/VATO are illegal!) There are some theoretical arguments that would indicate that the group by group based cut-off criterion ought to be preferred. This is particularly the case if all groups have an overall charge of zero, which is the case, e.g., for water. Contrary to this, it has been shown for the CHARMM parameters that the group by group based cut-off criterion gives inferior results; in addition, it is much slower, explaining that ATOM/VATOm is the CHARMM default.

Some remaining options: Having said this, we can take a look at the full default options of the current all-atom force field

CUTNb 14.0 CTOFnb 12.0 CTONnb 10.0 EPS 1.0 E14Fac 1.0 WMIN 1.5

The strings which you don't know yet are NBXMod 5 CDIEl VDIStance CUTNB 14.0 EPS 1.0 WMIN 1.5. The only important one is CUTNB since this is a value you may indeed want to change. It is the cut-off radius used to generate the nonbonded pair list and is discussed in detail in the next section. NBXMod 5 controls the handling of nonbonded exclusions (cf. above), and setting it to 5 means skipping 1-2, 1-3 interactions and treating 1-4 interactions specially. The WMIN distance is used to print out interaction pairs that are closer than 1.5 A. At shorter distances, very high LJ repulsion would result, and its useful to be warned since such pairs could cause problems in MD calculations. CDIEl and EPS are two related options, which nowadays are a left-over from a distant past and should be left alone. A brief explanation: The solvent (water) is a very important determinant when studying the properties of biomolecules. When computers were slower, one often tried to avoid simulations in explicit solvent, and the two parameters can be used to mimic the influence of solvent. Water has a high dielectric constant (DC), which screens electrostatic interactions. To mimic the presence of water in gas phase simulations, CHARMM allows the user to change the value of the DC by the EPS keyword (default value is 1), i.e., the electrostatic energy is computed according to . The CDIEl / RDIEl option is another attempt to mimic water in gas phase simulations. CDIE stands for constant (uniform) dielectric, RDIE means a distance-dependent dielectric, , where on the right hand side means the number set by EPS. Effectively, when you use RDIE, you compute a modified electrostatic energy according to . Nowadays when using explicit solvent, you should always use CDIE and leave EPS set to 1 (i.e., leave them at the default values!). Should the use of explicit solvent be too expensive computationally, CHARMM nowadays offers several implicit solvent models. Finally, VDIStance is (must be) a remnant of some long lost CHARMM option; the author (S.B.) honestly acknowledges not to know what it was supposed to do. Rest assured, it doesn't do any harm (it's parsed in the code and never used after that).

A final remark on ENERgy options: So, what nonbonded options should you set? In fact, if you read in the default param22/27 parameter file (e.g, par_all27_prot_na.prm), the options just discussed get set for you automatically. At first glance, there is no need to set any of these options explicitly; unfortunately, however, this turns out not to be true. This does not mean that you should modify the default options "just for fun". Remember, changing some values (e.g. CDIE or EPS, or replacing (V)ATOM by (V)GROUP , or changing E14Fac) is outright dangerous/false. You may want, e.g., to use some of the force based shifting/switching methods, but you should be experienced enough to understand thoroughly what you are doing and why you are doing it since the parameters were developed with the options shown above. In practice, what you'll want to change/add is to request the use of Ewald summation (PME) for solvated systems (see below); some PME related options need to be adapted depending on your system size. Similarly, for performance reasons you may want to choose a different nonbonded list cutoff CUTNB (but we need to understand more about nonbonded lists first). Normally, you would not want to change the cut-off radii CTOFnb, CTONnb, since they are part of the parameterization. Unfortunately, this is where it becomes tricky: Suppose you work with the param22/27 parameters, and, thus, correct default nonbonded options have been set. You decide (for reasons that will become clear in the next subsection) to increase CUTNB from 14 to 16 Ang. Thus, you specify

ENERgy CUTNB 16. ! WARNING: may not do what you want!

and assume that everything else is left alone. Unfortunately, for the above CHARMM command another default mechanism kicks in. If you change CUTNB, but do not set CTOFnb / CTONnb explicitly, the latter get changed according to CTOFnb = CUTNB - 2. and CTONnb = CTOFnb -2., hence you would suddenly be calculating with CUTNB 16. CTOFnb 14. CTONnb 12., which likely is not what you had in mind. It is, therefore, a bit dangerous to rely on defaults set by the parameter file. Although unsatisfactory, we therefore recommend to set the nonbonded options explicitly before doing any real work. The respective energy line should be identical to the defaults set in the parameter file, with the exception of individual parameters you might want to change, such as a modified CUTNB or replacing SHIFt by FSHIft.

In addition, there is one case were you have to change CTOFnb (and hence CTONnb): If you simulate small periodic systems, the minimum image criterion dictates that the cut-off radius must be smaller or equal to half the box-length (cubic PBC). For CTOFnb 12., this means that your cubic periodic box should have a side length of at least 24 Ang. Thus, for smaller systems, CTOFnb (and, hence, CTONnb) have to be reduced. Interestingly, some known workers in the field ((notably W. van Gunsteren and his group) find this such a bad choice (viz. reducing the cut-off radius below the force field default), so that the smallest boxes they use always have side lengths of twice the default cut-off radius of their force field. (Again, for the CHARMM all atom force field this would mean no boxes smaller than 24 A!)

Nonbonded lists

Background on nonbonded lists

In molecular mechanical energy/force calculations a cut-off criterion is normally used to limit the number of nonbonded interactions. Energy minimizations and especially molecular dynamics simulations require hundreds and thousands (if not millions) of energy/force calculations. To avoid checking for excluded pairs (1-2, 1-3 and, possibly, 1-4 neighbors) and, much more importantly, the cut-off criterion during every single energy calculation, lists are generated that contain all pairs for which the nonbonded interactions really have to be computed. This is an unnecessary overhead for a single calculation, but saves a lot of time during repeated computations.

In fact, any energy calculation in CHARMM consists really of two steps: (1) Calculation of the nonbonded list, (2) calculation of energy and forces using this nonbonded list. The time saving in repeated energy/force calculations (e.g., minimization, MD) comes from the fact that the nonbonded list is not generated for every energy calculation, but instead the list is kept for several consecutive steps (energy calculations). Now suppose the nonbonded list were generated with the cut-off radius used in the energy calculation (CTOFnb). The forces calculated in the first energy calculation are used to generate new particle positions (regardless whether we are doing minimization or MD). Because of the movement of the atoms, the original nonbonded list may not be correct anymore, i.e., it may contain pairs that are separated by more than CTOFnb and, worse, there may now be pairs less than CTOFnb apart for which no interactions are computed since they are not part of the list. For this reason, the nonbonded list should always be generated with a cut-off criterion/radius larger than CTOFnb, and for this reason CHARMM provides the separate CUTNB parameter used exclusively in list generation. With CUTNB > CTOFnb (by at least 1-2 Ang), we can be certain that the nonbonded list is valid at least for a certain number of steps. This now explains the hierarchy of cut-off options in the param22/27 ENERgy example above, where we had

ENERgy ... CUTNB 14. CTOFnb 12. CTONnb 10. 

CUTNB controls the list generation, whereas CTOFnb and CTONnb are used for the energy calculation (CTONnb only for those interactions modulated by a switching function). Graphically, this can be illustrated as in the following scheme (where CUTNB 16. instead of 14. was chosen):

Graphical view of cutoffs

The difference between CUTNB and CTOFnb, sometimes referred to as "skin", governs the frequency with which the nonbonded list needs to be updated. Provided the update frequency is appropriate, the choice of CUTNB is a true choice for the user, which should not affect the results, but which may affect performance. A larger skin means more work for a single non-bond list generation, but the possibility of a larger update frequency. The respective performance from smaller / larger skins depend on the computer hardware, the system size and whether the computation is carried out in parallel or not. No general rule can be given, but frequently larger skins (e.g., 4 Ang) perform somewhat better than the default skin size of 2 Ang.; for large scale production calculations it definitely is worth to experiment and benchmark to find an optimal value for CUTNB.

So, how often is the list updated / should one update the list? In CHARMM, this can be done by hand, or one can leave the decision to CHARMM. The keyword is INBFrq, and INBFrq n, with n a positive integer tells CHARMM to update the nonbonded list every n energy calculations. The correct choice of n is left to the user and depends on the size of the skin (cf. above).

Alternatively, INBFrq -1 (or any negative integer) tells CHARMM to watch the particle positions and update the nonbonded list automatically. The algorithm is guaranteed to err on the side of safety, so with INBFrq -1 one should always have a correct nonbonded list. For quite some time, INBFrq -1 (= heuristic update) has been the default (the old default of +50 was horribly inadequate and almost guaranteed wrong results!); nevertheless, this is one parameter that there is no harm in setting explicitly.

Finally, as is so often the case, CHARMM offers more than one algorithm to construct nonbonded lists; all of them, however, deliver a list in the same format (so for the energy routines it is irrelevant how the nonbonded list was generated!). Only two will be mentioned here. The old, default method is labelled BYGRoup, and BYGR is the (optional) keyword that chooses this method (again, it's the default). When performance is needed (large systems, running on a parallel machine), then one should use BYCB. It supports normal MD of solvated systems under periodic boundary conditions, but cannot be used for some more advanced usage scenarios. Also, BYGR can generate nonbonded lists suitable for group and atom based cut-offs; BYCB only supports atom based cutoffs.

What the ENERgy command really does --- or how to break ENERgy into its individual steps

From the above section, it should be clear that the ENERgy command is actually doing more than computing the energy since it also takes care of computing the nonbonded list. In fact, it is relatively smart and computes a nonbonded list if none exists (or if it is deemed outdated), but uses an up to date nonbonded list, should one be available. Thus, it is a wrapper, similarly to the minimization (MINI) and molecular dynamics (DYNA) (family of) commands.

If you look again at the options we have discussed so far, then they fall into two groups, (1) control of details of the energy calculation (e.g., CTOFnb), and (2) control of the nonbonded list generation (CUTNB, INBFRQ, BYGR/BYCB). If you go beyond a single point energy calculation (minimization, MD), then you have a third class of options controlling details of the minimization or MD.

The various steps hiding underneath the ENERgy command can actually be broken up; we show this for pedagogical purposes; also, the availability of the NBONds, UPDAte and GETE commands is occasionally useful in practice as well. NBONds parses and stores the energy related options, as well as list generation options. The UPDAte command generates the actual list (optionally, one could also here specify list generation and energy options). Finally, GETEnergy just computes the energy, the existence of a valid nonbonded list is assumed (otherwise your CHARMM job will crash). First, the standard way using ENERgy

! read RTF file
! read param file
! read PSF
! read coordinates

! compute energy of system, explicitly specifying the (default)
! options
CUTNb 14.0 CTOFnb 12.0 CTONnb 10.0 EPS 1.0 E14Fac 1.0 WMIN 1.5

Instead, the same can also be accomplished in steps:

! read RTF file
! read param file
! read PSF
! read coordinates

! parse nonbonded options
CTOFnb 12.0 CTONnb 10.0 EPS 1.0 E14Fac 1.0 WMIN 1.5 -
INBFrq -1 CUTNb 14. ! list options

! do generate list

! calculate "just" the energy 

[SB: Probably should show this explicitly with crambin]

As mentioned, the MINImiziation and (molecular) DYNAmics commands understand all options controlling nonbonded list generation and nonbonded energy / force calculation, as well as options specific to the task (minimization, MD). In particular for DYNA, the cumulative number of options (list generation, energy calculation, control of MD) can easily amount to 30-50. Such a command line can be become quickly very confusing. Given that list generation and energy options are remembered, it is therefore recommended to split the setting of options as shown below:

! PSF and full set of coordinates have been read
! optionally, periodic boundary conditions (see below) have been set up
ENERgy <list options> <energy options>
DYNA <only dynamics specific options>

Instead of ENERgy, one could also use NBONds/UPDAte. [SB: Example scripts in should reflect this recommendation]

Periodic boundary conditions (PBC): IMAGe/CRYStal

The textbook view of PBCs

We trust that you have read some background material on molecular mechanics and molecular dynamics simulations, and, thus, expect you to have some familiarity with periodic boundary conditions (PBC). The following is intended as a refresher with emphasis on pointing out where things may not so obvious for macromolecular systems.

Even with today's computers, the system sizes we can handle are microscopically small. Thus, without any special precautions we would be simulating "infinitesimal droplets", the properties of which would be dominated by surface effects. It is standard practice to work around this by employing periodic boundary conditions (PBC), i.e., by assuming that the central simulation system is surrounded in all spatial directions by exact copies (this assumes, of course, that you have a space-filling geometry; obviously, you can't have PBC for a spherical system. Anticipating CHARMM terminology, we refer to these surrounding boxes as images (image boxes). Unless otherwise stated, we'll assume a cubic simulation box (note, though, that CHARMM can handle any valid crystallographic space group for PBC!)

The role of the surrounding boxes (images) is to replace particles that leave the central simulation box. Consider a minimization or MD simulation, when the positions were just updated. You know the basic picture: as a particle leaves the central (cubic) box, say, to the right, it is replaced by the corresponding particle from the image of the central box on the left. Thus, working with PBCs entails making sure that your particle coordinates always fall within the central simulation box. This can be accomplished with computer code along the lines (the center of the box is supposed to be at the origin!)

if (periodicx) then
  if (x <  -xsize/2.0) x=x+xsize
  if (x >=  xsize/2.0) x=x-xsize

where xsize is the length of the box in x-direction. Analogous statements apply to the y and z coordinate, and for a cube xsize = ysize = zsize=L

There is, however, a second, more important component to PBC. Consider the interaction (LJ and/or electrostatic) between two particles i and j. If the replicas of the central box were real, we would have interactions between i and j in the central box, but also of i with j in all of the (infinite) number of images of the central box. This is, of course, not what we want to have; instead, we choose the interaction between i and j having the shortest distance. This may be the interaction between i and j in the central box, but it may also be the interaction between i and a particular image atom of j. In pseudo-code this so-called minimum image convention can be expressed as

if (periodicx) then
  dx = x(j) - x(i)
  if (dx >   xsize/2.0) dx = dx - xsize
  if (dx <= -xsize/2.0) dx = dx + xsize

where x(i) and x(j) are the x coordinates of i and j, respectively, and analogous statements for y and z. If you consider the above realization of the minimum image convention carefully, you'll note that the longest possible inter-particle distance is half the box-length, L/2. Thus, if you used a cut-off radius greater than L/2, your effective cut-off radius would still be L/2. As we shall see, CHARMM handles PBC rather differently, and for the minimum image convention to be obeyed the cutoff radius must not be larger than L/2, i.e., it is the user's responsibility to shorten CTOFnb if necessary!

Again, we assume that you have heard all this before. The problem with the above illustrative examples of PBC/minimum image convention is that the code assumes a collection of atoms. Consider, e.g., a box of waters or a box containing a protein surrounded by water. Suppose one water edges out of the box, i.e., the oxygen and one hydrogen are still within the central box, but the second hydrogen is out of the box. Then naively applying periodic boundary conditions will tear the molecule apart, i.e., one has to make sure that a water is shifted periodically only as a complete molecule. Similar considerations apply of course to the protein, or any molecule present in the system. Also, the PBC/minimum image checks have to be carried out for each energy calculation. For cubic boxes, this is not too bad, and by optimizing the above code snippets, the overhead is small. However, for more complex periodic boxes (truncated octahedron, rhombic dodecahedron), this may quickly become a nuisance.

How CHARMM does PBCs

Thus, CHARMM does things differently than most textbooks (the notable exception being Rapaports' Art of Molecular Dynamics Simulation, where the general approach taken by CHARMM is outlined), and, actually, differently than most comparable programs. Here we describe the underlying algorithmic approach, whereas in the next subsection we describe how things work from the user perspective and what pitfalls one has to be aware of.

CHARMM takes periodic images "seriously", i.e., (a subset of) the atoms of the periodic images surrounding the central box are generated. (The name of the routine is MKIMAT (that's a subroutine name, not a CHARMM command!), keep it in mind, we'll have to comment on this routine somewhat more later on). First consequence: the actual number of atoms in your system with PBC is (much) larger than the number of atoms in your central box. (This apparent "waste of memory" is put to good use for the sake of generality and also performance.) Contrast this with the textbook approach, the image boxes are not really present, as an image atom "enters", a "real" atom "leaves/vanishes". The number of image atoms is kept as low as possible by two factors. First, one does not have to generate the atoms from all image cells. In 3D a central cubic box is surrounded by 26 image. Of these, only, e.g, those to the right, the top, and the back of the central cubic system are needed, reducing this number to 7. Second, for large boxes (box lengths of solvated protein systems are easily 60 - 100 Angstroms!), one does not need to replicate the full box, instead, one needs only those image atoms which are within the cutoff radius, or more correctly, the cutoff radius used to generate the nonbonded list(s).

A periodic simulation system in CHARMM consists of the central box (e.g., a cube), and a (partial) layer of image atoms with width CUTNB (or actually CUTNB + some safety distance). Now, one generates two nonbonded list, one for atoms in the central box ("primary atoms"), the second between primary atoms and image atoms. Nonbonded interactions are computed twice, once with the primary-primary list, and once with the primary-image list. In the CHARMM output you'll find the energies listed separately, i.e., the total LJ and electrostatic energy of a system is now the sum of two terms each. You'll probably have to draw some diagrams (preferably in 2D) to convince yourself that this works. There is one border line case, and that is the case of small boxes. Here you have to ensure that the cutoff radius for the energy calculation (CTOFnb) is less than half the box length. If this is the case, even if for a pair of particles i, j, entries exist in both the primary-primary and primary-image lists, at any given moment only one distance can be lower than half the box-length. By choosing a short enough cut-off, you ensure that the minimum image convention is implicitly obeyed. Unfortunately, CHARMM provides no warning if you overlook such a case, and this is one of the pitfalls lurking when using PBC in CHARMM. More in the next subsection.

Use of CRYStal / IMAGe in its simplest form

Before dwelling on pitfalls, let's look at the practical aspects of setting up PBC. The user interface for setting up PBC in CHARMM is provided by two modules, IMAGe and CRYStal. IMAGE and CRYSTAL provide similar capabilities and complement each other. One may also view CRYSTAL as an interface to make IMAGE more user-friendly, and this is the way it is usually employed nowadays. [SB: should try to be more precise here]

Before trying the following snippets, You should have read in RTF, parameters, PSF and coordinates for your system. Also, assuming a protein / water system, we assume that you have DEFIned two atom selections, protein containing all your protein atoms, and water, containing all your TIP3 waters. Finally, let's assume that you have a box length of 60 Ang., and you are using / planning to use the default cutoff parameters (CUTNb 14. CTOFnb 12. CTONnb 10.) Then, the following four lines set up PBC for a cubic box:

CRYSTal DEFI CUBI 60. 60. 60. 90. 90. 90. 
IMAGE BYREsidue XCEN 0.0 YCEN 0.0 ZCEN 0.0 SELE water END
IMAGE BYSEgment XCEN 0.0 YCEN 0.0 ZCEN 0.0 SELE protein END

The first line defines the crystal symmetry (CUBIc in our example), and gives the necessary information, side lengths A, B, C and the three angles , , , which in our case are A = B = C = 60 Ang, and , respectively. The generic form of the command would be

CRYS BUIL DEFI <type> A B C   

(It is particularly easy to build rhombic dodecahedrons or truncated octahedrons, which are often preferred over cubic simulation boxes.)

The second line initiates the building of image atoms. First, it would permit to augment the first line by additional symmetry operations (NOPE option, number of operations). Since CHARMM knows about cubic boxes, no further information is needed, hence NOPE 0. More important is the CUTOff parameter, which actually indicates how deep to construct the layer of image atoms. In order to work with the nonbonded list approach of CHARMM, the variable XO (as we call it here, any variable name is fine, alternatively, you can give the number directly) has to be as large as CUTNB (list generation cutoff!). Often, one augments CUTNB by 1 or 2 Angstroms; but keep in mind that any Angstroms may add thousands of image atoms to your system!

The meaning of the third and fourth line ("raw" IMAGe commands for a change) become clear when one considers the CHARMM way of handling PBC in the course of a minimization or MD simulation when the coordinates of all particles change continuously. Eventually, particles in the primary box will drift out of the box, and atoms in the image layer will enter the primary region (or "diffuse" further away). This is no need for immediate concern, since the skin in our nonbonded lists gives us a safety net (just as in the absence of PBC). Eventually, however, the central box and the image layer will have to be rebuilt (by subroutine MKIMAT). The obvious time to do so is when the nonbonded lists are recomputed. At this point all atoms are essentially subjected to a PBC like test, but as outlined above, one has to avoid breaking molecules into two. The two IMAGe lines tell CHARMM to apply periodic shifts to water on a residue by residue (= molecule by molecule) basis (option BYREsidue, line 3), whereas the protein is shifted as a whole (BYSEgment, line 4) -- in the case of proteins, shifting by residues could break the protein at peptide bonds!

This is it, or rather, this should be it. One would assume that once PBC is set up (via CRYStal / IMAGe as shown), any nonbonded list update would update both the primary-primary and primary-image list, and that CUTNB would be understood as being applicable to the generation of both lists. Alas, not so ... For (lets assume) historical reasons, the update frequency for the two lists (primary-primary, primary-image) are controlled by two different parameters, INBFrq and MGFrq; similarly, there are two cut-off radii, CUTNB for the primary-primary and CUTIM for the primary-image non-bonded lists. Obviously, INBFrq should always equal IMGFrq, and CUTNb should always equal CUTIm. To add insult to injury, there are no mechanism to enforce this equality or to set reasonable defaults. E.g., while INBFrq defaults to -1 (heuristic update, which is good), IMGFrq has a default of 50, which is nonsensical! Thus, it is a sad fact of life that it is the user's responsibility to ensure meaningful values for INBFrq, IMGFrq, CUTNb and CUTIm, otherwise rubbish can and will be produced.

So, for the sake of good practice, let us show the necessary steps to set up PBC for a cubic system and ensure correct energy / force calculations. For good measure, we use a large than default CUTNB/CUTIM, and we request BYCB:

! read RTF

! read parameters (we assume param22/27, which in principle sets up
     ! most defaults correctly, i.e., what we get at this point
     ! corresponds to
     !   NONB nbxmod  5 atom cdiel shift vatom vdistance vswitch -
     !        cutnb 14.0 ctofnb 12.0 ctonnb 10.0 eps 1.0 e14fac 1.0 wmin 1.5
     ! as discussed

! read psf

! read coors

set rc 16. ! value we want for CUTNB/CUTIM
calc xo = @rc + 2. ! add a safety "net" to the CRYStal BUILd CUTOff
     ! NOTE: xo must be at least equal to rc, i.e., SET XO @RC 

CRYSTal DEFI CUBI 60. 60. 60. 90. 90. 90. 

! now issue an ENERgy command to (re)set all defaults!
     NBXMod  5 ATOM CDIEl SHIFt VATOm VDIStance VSWItch -
     CTOFnb 12.0 CTONnb 10.0 EPS 1.0 E14Fac 1.0 WMIN 1.5

! Note: most likely we would use the opportunity to replace SHIFte 
!       electrostatics by PME Ewald, see below

The above should guarantee that all subsequent minimization and or MD calculations generate both nonbonded lists at correct intervals with sane cut-off radii. As noted in the comment, the one unrealistic aspect of the above code snippet is that PME is not set up. Normally, the ENERgy command above would also be used to replace SHIFted electrostatics by PME; however, we first need to introduce Ewald summation and PME.

Ewald summation / Particle-Mesh-Ewald (PME)

Ewald summation and PME for dummies

Some more rigorous background can be found at In the following I comment in (over)simplified form on some aspects of Ewald / PME, which I find is surprisingly often misunderstood in practice.

Ewald summation (ES) is a method to sum up (= calculate) electrostatic interactions in an infinite lattice, or, alternatively to compute electrostatic interactions under periodic boundary conditions. If you want to look at it that way: with ES you take these periodic images "seriously". In symbols, the starting point of ES is to compute the (infinite) sum (the lattice sum)

Here , are the charges of particles i, j, is the vector between particle positions i, j, and is the lattice vector pointing into all periodic images of the primary box. The prime in the sum over the is used to indicate that for (primary box), the case i=j is excluded.

has the unpleasant mathematical property of being conditionally convergent, i.e., its value depends on the order of summation.

In addition to being conditionally convergent, direct computation of the above lattice sum would be extremely tedious since the convergence is also slow. ES is a trick to accelerate the convergence; as a by-product the singularity of the Coulomb interactions responsible for the conditional convergence is avoided. (Thus, ES may be viewed as a modified lattice sum).

In ES the above sum is split into two sums, according to

Here, erf and erfc are the error and complementary error function, respectively. For a suitable choice of , the first term on the rhs is short-ranged, whereas the second term contains all the long-range interactions. More specifically, if is chosen sufficiently large, for the first term only interactions in the primary simulation box need to be taken into account (think of the erfc term acting as a "shifting" function, damping the long range 1/r potential rapidly to zero. Thus, the kernel labelled real leads to the real space sum of electrostatic interactions which is computed normally (using the complementary error function erfc(x) as "shifting" function),

The kernel labelled reciprocal, on the other hand, is long-ranged, and summation here would have to run over all lattice vector n. Long-ranged interactions, however, become short-ranged in reciprocal space (hence the name), and essentially by Fourier transformation the sum resulting from the reciprocal kernel becomes a rapidly converging sum in reciprocal space, referred to as "reciprocal sum" or "k-sum".

The exact mathematics adds a few additional terms and corrections, which CHARMM computes correctly, and hence we can afford to ignore. The computationally intensive parts are the real space and reciprocal space sums just discussed. The former, as pointed out, can be computed within the normal framework of nonbonded interactions, whereas the second is a somewhat funny term, entailing a double sum over atom positions and reciprocal space vectors. We spare you the details since the k-sum is nowadays usually computed by the so-called particle-mesh-Ewald (PME) technique. (Note: one sometimes gets the impression that ES and PME are considered different things / methods. Just to be clear: PME is a fast, efficient way of approximating one term of ES, the k-sum!)

The basic trick of PME consists in the fact that the k-sum is not computed for atom positions, but for fixed positions on a grid. This allows to pre-compute a lot of stuff and to use fast fourier transforms (FFT) to speed things up further. The atomic charges are smeared out on the grid in each step, potential (and forces) computed on the grid and back-transformed to the atom positions. By using so-called B-splines for the "smearing", it suffices to compute the potential; the forces are obtained by differentiation of the splines (this avoids separate FFTs for potential and force calculation). Traditional ES scales approximately as for N atoms; PME scales as , which makes a big difference for large systems.

CHARMM implements both traditional ES as well as PME to compute the k-sum. Historically, the two methods were coded independently, and the older implementation of ES has a number of limitations. Since it is also much slower than PME, only PME is discussed in the following.

Setting up PME

Ewald summation is requested by the EWAL option to ENERgy, and PME is requested by adding PMEWald as an option. While the original ES method had numerous options fine-tuning the calculation of k-vectors, for PME only 4 additional options are relevant. The first is the choice of , the others specify the number of grid points in each of the three spatial directions. A full energy call requesting PME looks like

! set up CRYStal / IMAGes (= PBC), otherwise the following will fail

The above sets to 0.43, and requests a grid. As a rule of the thumb, the grid spacing in each spatial direction should be approximately 1 Angstrom. One has to keep in mind, however, that the FFT implementation of CHARMM has certain limitations concerning the choice of prime factors; essentially, only powers of 2, 3 and 5 are allowed, e.g., FFTX 27 (= 3x3x3) is OK, but FFTX 28 (= 2x2x7) is not. Further, the more powers of 2 relative to powers of 3 and 5, the better (the faster).

The trickier parameter is . Since the real space sum is calculated by the normal nonbonded routines, the cut-off radius CTOFNB is applied. It, therefore, has to be sufficiently large as to ensure sufficient damping that at distances interactions and forces are effectively zero. If the damping is too low ( too small), then you miss interactions and introduce a severe error in your calculations. There are several rules of the thumb to check the correct choice of ; below is the exact criterion to choose For your value of (CTOFNB) and evaluate (e.g., Mathematica!)

To be on the safe side, if your , increase (ideally, ).

Final note for running MD with PME: The implementation of the PME in CHARMM (as well as in most other comparable programs) does not conserve center of mass translation (sum of all momenta is not exactly zero). If no provisions are taken, your system might start to pick up a net translational velocity component. To compensate / avoid this artifact, CHARMM enforces that during MD you take out center of mass translation by specifying a non-zero NTRFrq (reasonable values are from NTRFrq 500 to NTRFrq 5000) [SB: this is what we use; has anyone really looked into this?]

A theoretical PS: ES is mostly discussed in the context of summing up electrostatic interactions in an infinite lattice (see, e.g., the presentation in Allen/Tildesley). A different point of view can be found in the presumably first publication in which ES was used in the context of a computer simulation (Brush, Sahlin and Teller, JCP 45, 2102 (1966), yes the H-bomb Teller!). The Ewald potential is simply the solution to the Poisson equation under periodic boundary conditions.