Surface area calculations (ACCESS)

Introduction.

Many commands are available throughout WHAT IF to work with accessible surfaces. Solvent accessible surfaces, contact surfaces, Van der Waals exclusion volumes and Connolly surfaces can be calculated, reset, shown and summarized. When the solvent accessibility dots or the Van der Waals exclusion volume dots are sent to the graphics, they get the color of the atom they belong to. It is also possible to color atoms on the screen according to their solvent accessibilities. The accessibility menu is activated with the command ACCESS.

Before the individual options are described, some general remarks about the principle of this module are needed.

WHAT IF can NOT keep track of accessibilities when changes are made to the SOUP. So, after making mutations, insertions, deletions, etc., you have to issue the INIACC command to initialize all accessibility related parameters, and then recalculate the accessibilities with SETACC. Don't worry, the accessibility module in WHAT IF is the fastest in the world...

Also, some remarks are needed on the accessibility calculation algorithm. There is much confusion in the literature about nomenclature. In WHAT IF the following definitions are used: The contact surface, or molecular surface, is the area at the Van der Waals surface that can be touched by a water molecule (or any probe; you define its radius with the WATRAD parameter in the PARAMS menu). The accessible surface area is defined by all positions where the center of that water molecule (or probe) can be found. Re-entrant surfaces are neglected by WHAT IF. Nevertheless, the WHAT IF results always come out within a few percent of the values from the much fancier, but much slower, Connolly program.

At any one time, WHAT IF can only work with one kind of surface: either the contact (molecular) surface, or the accessible surface. The WHAT IF relational database (see chapter on SCAN3D) holds the accessible surfaces and the contact surfaces for the water probe with radius = 1.4 Angstrom.

All accessibility related options use united heavy atoms, and thus neglect all the protons.

Initialize accessibility information (INIACC)

The command INIACC will wipe out all previously gathered information about accessibilities. All flags in the program will be set such as to indicate that no accessibilities have ever been calculated.

Statistics about accessibilities (SHOACC)

The command SHOACC will cause WHAT IF to prompt you for a residue range. For all residues in this range the accessibility will be listed.

Second, a list of residue types will be given. For each of the 20 amino acid types, its frequency in the given range and the average of the observed accessibilities for residues of this type in the given range will be shown.
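
The per-type summary described above can be illustrated with a small Python sketch. It is not WHAT IF code; the function name summarize_by_type and the input layout (pairs of residue type and accessibility) are assumptions made only for this illustration.

```python
from collections import defaultdict


def summarize_by_type(residues):
    """Return {residue_type: (frequency, average accessibility)}.

    residues: iterable of (residue_type, accessibility) pairs.
    This input layout is an assumption, not a WHAT IF format.
    """
    totals = defaultdict(lambda: [0, 0.0])  # type -> [count, summed accessibility]
    for res_type, acc in residues:
        totals[res_type][0] += 1
        totals[res_type][1] += acc
    return {t: (n, s / n) for t, (n, s) in totals.items()}


# Made-up values, purely for illustration.
example = [("ALA", 10.0), ("ALA", 30.0), ("SER", 45.0)]
summary = summarize_by_type(example)
```

Here summary["ALA"] holds the count of alanines in the range together with their mean accessibility.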

Calculate accessibilities (SETACC)

The command SETACC will prompt you for a residue range. For all atoms in all residues in this range the accessibility for a probe with a user definable radius (default = 1.4 Angstrom) is calculated. Later the user can both use the individual atomic accessibilities, and the accessibility summed over the whole residues.

The accessibility calculations are done with respect to an environment. You will be prompted for this environment. All molecules that you don't add to this environment are for the accessibility calculations regarded as being absent. Any second calculation of accessibilities will use the same molecules as environment, unless you use the INIENV command in-between. The molecule that holds the residues for which you want to calculate the surface is always part of its own environment.

The environment

Defining a new environment (INIENV)

The first time you enter the accessibility module, you will be asked to define an environment. Every subsequent accessibility calculation will use this same environment. If you want another environment, you can use the INIENV option. INIENV will wipe out all environment information, so the first time you want to calculate something after INIENV has been used, you will be asked to define a new environment.

Listing the environment (SHOENV)

The command SHOENV can be used to inspect the present environment. The environment is the list of molecules that are taken into account when accessibility calculations are performed.

Other accessibility calculations

Relative accessibility (VACACC)

The command VACACC can be used to calculate the accessibility of a residue in a GLY-XXX-GLY tripeptide in vacuum, much as described by Cyrus Chothia in "Principles that determine the structure of proteins", Ann. Rev. Biochem. 53 (1984), 537-572. (In case of an N-terminal residue, GLY-XXX is used, and for a C-terminal residue XXX-GLY).

You will be prompted for a residue range. The above mentioned calculation will be performed one by one for every residue in the given range. The values obtained are a good approximation for the accessibility in the unfolded state of the protein.

If the normal accessibilities were calculated prior to execution of this option (with the SETACC command) the relative accessibilities are also calculated. Relative accessibilities are the percentage of the accessibility in the unfolded state, still available in the folded protein.
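
As a minimal sketch of this definition (not WHAT IF code), the relative accessibility is the folded-state value expressed as a percentage of the unfolded-state (vacuum) value. The function name and the handling of a zero reference value are assumptions of this example.

```python
def relative_accessibility(folded, unfolded):
    """Percentage of the unfolded-state accessibility still available when folded.

    folded:   accessibility in the folded protein (as from SETACC)
    unfolded: accessibility in the vacuum tripeptide (as from VACACC)
    Returning 0.0 for a zero reference is a choice made for this sketch only.
    """
    if unfolded <= 0.0:
        return 0.0
    return 100.0 * folded / unfolded
```

For example, a residue with 25.0 square Angstrom accessible in the folded protein and 100.0 in the tripeptide has a relative accessibility of 25 percent.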

For every residue WHAT IF will show you the same table as given by the LISTA option, but underneath the accessibility and the unfolded state (or vacuum) accessibility the totals are given. Also two extra columns at the right side are added, one for the atomic unfolded state accessibility, and one for percentage.

Accessibility of the C-beta (ACCALA)

Often accessibility values do not provide the information one would like to get. One alternative view of accessibility is "What would the accessibility be of the C-beta of an alanine at this position?". The option ACCALA allows for just this. You will be prompted for a residue range, and optionally for an environment, just as with the SETACC command. However, every residue in the range will be mutated to alanine just before the accessibility of its C-beta is calculated, and it is put back in its original state immediately after the accessibility calculation.

Surface analysis

Analyze the surface (ANASRF)

The option ANASRF can only work after SETACC has been executed. ANASRF will do many things, and since we keep working on WHAT IF, it is likely that your version will already do more than is described here.

ANASRF will first cause WHAT IF to calculate the sum of the buried and of the accessible surface area for the four backbone atoms (N, Ca, C, O). The same numbers are also calculated for the four atom types (C, N, O, S) that can occur in side chains. For the side chains all atoms of a certain type are added up, so for example Ser-O-gamma and Asp-O-delta2 both are added to the bin for O.
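
The binning can be pictured with the following Python sketch. It is only an illustration: the tuple layout, the function name bin_surface, and in particular the definition of the buried area as a per-atom reference area minus the accessible area are assumptions, since the manual does not state how WHAT IF defines the buried part.

```python
BACKBONE = {"N", "CA", "C", "O"}


def bin_surface(atoms):
    """Sum accessible and buried area per backbone atom and per side chain element.

    atoms: iterable of (atom_name, element, accessible, reference) tuples.
    ASSUMPTION: buried = reference - accessible, where reference is some
    fully exposed area for that atom. WHAT IF may define this differently.
    """
    bins = {}
    for name, element, accessible, reference in atoms:
        # Backbone atoms get their own bin; side chain atoms are pooled by element.
        key = ("backbone", name) if name in BACKBONE else ("side chain", element)
        acc, bur = bins.get(key, (0.0, 0.0))
        bins[key] = (acc + accessible, bur + (reference - accessible))
    return bins


# Made-up numbers, purely for illustration.
atoms = [
    ("N", "N", 2.0, 10.0),
    ("OG", "O", 12.0, 30.0),   # Ser O-gamma
    ("OD2", "O", 8.0, 30.0),   # Asp O-delta2
]
bins = bin_surface(atoms)
```

Both oxygen atoms end up in the ("side chain", "O") bin, as described for Ser-O-gamma and Asp-O-delta2.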

If you are in the business of protein design, and are generating large quantities of potential models, you might want to get an impression about the quality of these models. WHAT IF provides many protein structure quality control tools, e.g. RNGQUA in the QUALTY menu, or EVACHI in the CHIANG menu. The option ANASRF will list the summed accessibilities for this range, and list next to it the average accessibility for that residue type in the June 1991 version of the PDB.

Thereafter, the following information will be shown per residue:

Residue number
Residue type
PDB unique identifier
Molecular surface area
Frequency of this residue type in the database
Average accessibility of this residue type in the database
Standard deviation in this averaged accessibility
Score (whatever that means) for this residue.

At the end the total score will be given, and the total score per residue. The so-called sigma score is a very rough estimator for the total quality of the residues in the inspected range.

You might want to experiment a bit with known molecules to see what this sigma score means.

Differential accessibility calculations (DIFACC)

If you want to know how much accessible surface is lost upon binding a ligand or another protein, you can use the DIFACC option. This option is rather complicated, and, upon incorrect usage, gives wrong results without any warning.

Proceed as follows:

First calculate the accessibility of the protein, using everything in the environment that is part of the complex, but not the waters. If a few waters are explicitly part of the active site, put them in a separate water group (see WATER menu), and add only that group to the environment.

After that, type DIFACC. This will cause WHAT IF to calculate the accessibilities again, but this time, you add the ligand, or the other molecule to the environment.

Afterwards, you get all kinds of statistics (see SHOACC and ANASRF for an explanation) that are based on the accessibility differences.

Warning. Both before and during the DIFACC option, you should include ALL residues in the calculation, because the statistics are based on ALL residues.
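
The bookkeeping behind such a differential calculation can be sketched in Python as follows. This is not WHAT IF code; the function name, the dictionary layout and the residue identifiers are hypothetical. The check that both calculations cover the same residues mirrors the warning above.

```python
def accessibility_loss(free, complexed):
    """Return the accessible surface lost per residue upon complex formation.

    free:      {residue_id: accessibility} calculated without the ligand
    complexed: {residue_id: accessibility} calculated with the ligand
               added to the environment
    Both dictionaries must cover exactly the same residues.
    """
    mismatch = set(free) ^ set(complexed)
    if mismatch:
        raise ValueError(f"residues not present in both runs: {sorted(mismatch)}")
    return {res: free[res] - complexed[res] for res in free}


# Made-up values and identifiers, purely for illustration.
loss = accessibility_loss(
    {"A10": 40.0, "A11": 12.0},
    {"A10": 25.0, "A11": 12.0},
)
```

Residues that are not in contact with the ligand show a loss of zero.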

Store residue accessibilities in a table (TABACC)

The command TABACC will cause WHAT IF to prompt you for a residue range and for a table number. It will then make a reals table out of the total accessibilities per residue for the given range. See the chapter on tables for more information about tables. In short, tables are the tool to make lists with many different kinds of WHAT IF output in them.

Displaying surfaces

Accessible or molecular surface display (GRAACC)

GRAACC calculates for all protein and DNA/RNA molecules the accessible or molecular surface (depending on the ACCTYP parameter). Contacts with other molecules like drugs, co-factors, waters etc. will be taken into account if they are in the environment (See INIENV and SHOENV). If the appropriate symmetry flags are switched on, symmetry related molecules are taken into account too (See chapter SYMTRY).

Van der Waals surface display (GRAVDD)

GRAVDD calculates for all protein and DNA/RNA molecules the Van der Waals surfaces. Contacts with other molecules like drugs, co-factors, waters etc. will at present still be neglected in this calculation.

Parameters for accessibility calculations

Showing parameters (SHOPAR)

The command SHOPAR will cause WHAT IF to show you the present values for the program parameters that are important for accessibility calculations.

Changing parameters (PARAMS)

The command PARAMS will as usual bring you to the little menu from which you can change the parameters that are important for this option. In this menu the following parameters are available:

Precision of the calculation (ACPREC)

The ACPREC parameter determines the precision of the accessibility calculations. The number of dots on the surface is a Fibonacci number from the series
1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
The (ACPREC+10)-th Fibonacci number from this series is the number of dots used. ACPREC can only range from 0 to 4 (that is, 89 to 610 dots).
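
The mapping from ACPREC to the number of dots can be written out as a short Python sketch. The function name dots_for_acprec is hypothetical; the series and the 0 to 4 limit are taken from the text above.

```python
FIBONACCI = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987]


def dots_for_acprec(acprec):
    """Return the number of surface dots for a given ACPREC value (0 to 4)."""
    if not 0 <= acprec <= 4:
        raise ValueError("ACPREC must be between 0 and 4")
    # The series is counted from 1, Python lists from 0.
    return FIBONACCI[acprec + 10 - 1]
```

With this counting, ACPREC=0 gives 89 dots, ACPREC=2 gives 233 dots and ACPREC=4 gives 610 dots.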

Radius of the probe (WATRAD)

The parameter WATRAD determines the radius of the probe used for the accessibility calculations. Be aware that changing WATRAD will change future calculations of accessibilities, but not those that were done before the parameter change.

WARNING. You should not change WATRAD between calculating the accessible surface and displaying it. After changing WATRAD, the accessible surface, Vanderdot surface, and/or contact surface have to be re-calculated.

Contribution to accessibility (OUTACC)

The parameter OUTACC is at present inactive, but will one day be used to allow you to write out which residues contribute most to the inaccessibility of the residue being calculated.

Surface calculation type (ACCTYP)

The parameter ACCTYP determines what kind of surface is being calculated. ACCTYP=0 directs WHAT IF to calculate a contact surface area. ACCTYP=1 calculates the accessible surface area (the area where the center of a water probe that touches the atom can be found).

Limit for being buried (LIMBUR)

At several places throughout the program you can select options to work on buried residues only. In those cases where WHAT IF does not prompt you for the amount of accessible surface area, this parameter is used. Total accessible surface less than LIMBUR is called buried.
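
In other words, the test is a simple threshold on the total accessible surface of a residue. A minimal Python sketch, with a hypothetical function name and data layout, and with the LIMBUR value passed in explicitly because its default is not given here:

```python
def buried_residues(accessibility, limbur):
    """Return the residues whose total accessible surface is less than limbur.

    accessibility: {residue_id: total accessible surface}
    limbur:        threshold; the caller has to supply the value
    """
    return [res for res, acc in accessibility.items() if acc < limbur]


# Made-up values and identifiers, purely for illustration.
buried = buried_residues({"A10": 2.5, "A11": 40.0, "A12": 0.0}, limbur=5.0)
```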

Exclude own molecule from calculation (USESLF)

If you want to calculate how much accessible surface is lost on one molecule solely due to another molecule, you can use the USESLF flag. With this flag set, the accessibility of each atom in a molecule is calculated as if it were the only atom in the whole molecule. This option produces rather artificial values, but can sometimes be useful to evaluate the differences between docking results.

Algorithm

The following procedure is followed when calculating the accessibilities:

1) Dots are put at the Van der Waals radius (VDD-options) or at the sum of the Van der Waals radius and the radius of a water molecule (default =1.4 Angstrom) (for ACC-options).

2) Every dot gets the value 1.

3) Every dot that falls within another sphere gets the value 0.

4) The sum of the values of all dots, divided by the total number of dots, and multiplied by the surface area of the sphere is used as the VDD-value, or the ACC-value.

The dots are placed using a Fibonacci algorithm, which ensures that they are placed on the surface as homogeneously as possible. Since WHAT IF uses 233 dots per surface as the default, and for its databases, the expected precision is roughly 5 percent. Much larger errors, however, are introduced by the choice of Van der Waals radii... You can see and change WHAT IF's Van der Waals radii using the SETVDW menu.
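
The four steps above can be sketched in Python as follows. This is a minimal illustration and not the WHAT IF implementation: the dot placement uses a golden-angle spiral, one common Fibonacci-style construction that is assumed here because the exact lattice used by WHAT IF is not described, and the function names and the atom layout (x, y, z, Van der Waals radius) are hypothetical.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly evenly spread unit vectors (golden-angle spiral).

    ASSUMPTION: this construction stands in for WHAT IF's own dot placement.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n      # evenly spaced heights in (-1, 1)
        r = math.sqrt(1.0 - z * z)         # radius of the circle at height z
        phi = i * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def dot_surface_area(index, atoms, probe=1.4, n_dots=233):
    """Dot-counting area for atoms[index]; atoms are (x, y, z, vdw_radius).

    probe=1.4 gives an ACC-like value, probe=0.0 a VDD-like value.
    """
    x0, y0, z0, r0 = atoms[index]
    radius = r0 + probe                    # step 1: sphere on which dots are put
    free = 0
    for ux, uy, uz in fibonacci_sphere(n_dots):
        px, py, pz = x0 + radius * ux, y0 + radius * uy, z0 + radius * uz
        # Step 3: a dot inside any other (expanded) sphere counts as 0.
        inside = any(
            (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 < (r + probe) ** 2
            for j, (x, y, z, r) in enumerate(atoms)
            if j != index
        )
        if not inside:
            free += 1                      # step 2: every other dot counts as 1
    # Step 4: fraction of free dots times the area of the sphere.
    return free / n_dots * 4.0 * math.pi * radius * radius


# Made-up coordinates and radii, purely for illustration.
isolated = dot_surface_area(0, [(0.0, 0.0, 0.0, 1.7)])
paired = dot_surface_area(0, [(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)])
```

An isolated atom keeps the full area of its expanded sphere, while a neighbouring atom removes the dots that fall inside the neighbour's expanded sphere.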

The accessible surface is not to be compared with the well known Connolly surface since reentrant surfaces are not calculated. The WHAT IF molecular surface and the Connolly surface however, agree very well.

Activating more commands (MORE)

Not all commands are immediately active in the ACCESS menu. By typing MORE, more commands will be activated. (Use LESS to deactivate the extra commands again).

Other (hidden) commands

Hidden command (DEBUG)

The accessibility menu knows one hidden command. It is only useful for programmers. DEBUG can be used to toggle a debug flag On/Off. The debug flag controls the amount of output.