Water operations (WATER)

Introduction.

The WATER menu holds several options that deal with water molecules. Waters are treated somewhat illogically in WHAT IF. This had to be done to save (lots of) memory, and to keep their treatment relatively simple (from a programming point of view). Waters are grouped. All waters that come from one PDB file, or one prediction run, etc. are grouped in one residue. This one residue is the only residue in one molecule. This means that you can use either whole-molecule options or residue-bound options to work with these groups of water molecules.

Waters are always numbered according to the PDB file. So, the water numbers are determined by the experimentalists, and not by WHAT IF.

Listing information about waters

List groups of water molecules (SHOWAT)

The command SHOWAT will cause WHAT IF to first execute the SHOSOU command from the soup menu and thereafter show you all groups of water molecules. For each group the number of waters will be given, but no information about individual water atoms will be shown.

Output of the SHOWAT option should roughly look like:

Date= 1996-01-05 12:59:07                                                             

    Contents of the SOUP:
 
Protein .................... : 1
Drug, ligand or co-factor .. : 1
DNA or RNA ................. : 0
Solvent not water or ion ... : 1
(Groups of) water .......... : 1
 
 Molecule      Range             Type            Set name
     1    1 (1    )  166 (166  ) Protein        y
     2  167 (167  )  167 (167  ) GNP            y
     3  168 (168  )  168 (168  ) MG             y
     4  169 (OH2  )  169 (OH2  ) water          y
 
Groups of water molecules in the soup:
 
Molecule number =  4    # of waters = 211

Listing groups of waters (LSTWAT)

The command LSTWAT will cause WHAT IF to automatically execute the SHOSOU command from the SOUP menu. Thereafter it will prompt you for a molecule number. This must be the molecule number that WHAT IF assigned to the group of water molecules. WHAT IF will then show you all atomic information presently available about every water in this group. The last column in this table will hold the so-called original water name. This name in brackets can be used to address individual water molecules.

You can of course also use the LISTA command, and give the residue number of this group of water molecules.

Output of the LSTWAT option should roughly look like:

Group of water molecules ........... : 4
Number of `residues` in this group . : 1
Number of `atoms` in this group .... : 211
 
Atom     X     Y     Z   Acc   B   Wt   VdW COL  Charg AtOK Use  #Water
 OH2   13.6  43.4  17.8  0.0 31.8  1.0  1.4 120  0.00    +   - 1356 (169)
 OH2    8.2  36.1  23.2  0.0 13.6  1.0  1.4 120  0.00    +   - 1357 (170)
 OH2    3.3  37.8  19.5  0.0 18.3  1.0  1.4 120  0.00    +   - 1358 (171)
 OH2    5.9  35.6  20.1  0.0 11.3  1.0  1.4 120  0.00    +   - 1359 (172)
  ......
Etcetera
Under #Water you see two numbers: the first is the atom number in the WHAT IF soup, the second (the one in brackets) is the PDB identifier of this water molecule.
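
If LSTWAT output has been captured to a file, the two numbers under #Water can be separated with a short script. The Python sketch below is not part of WHAT IF. The function name and the returned dictionary are illustrative assumptions, and the code assumes the column layout shown in the listing above.

```python
import re

# Matches the trailing "#Water" column, e.g. "1356 (169)" or "1476 (297  )".
_WATER_ID = re.compile(r"(\d+)\s*\(\s*(\S+?)\s*\)\s*$")


def parse_lstwat_line(line):
    """Return a dict for one water line, or None for headers and filler lines."""
    match = _WATER_ID.search(line)
    if match is None:
        return None
    fields = line.split()
    if len(fields) < 4:
        return None
    try:
        x, y, z = (float(value) for value in fields[1:4])
    except ValueError:
        return None
    return {
        "atom": fields[0],                    # atom name, e.g. OH2
        "xyz": (x, y, z),                     # coordinates as printed
        "soup_number": int(match.group(1)),   # atom number in the WHAT IF soup
        "pdb_id": match.group(2),             # original name from the PDB file
    }
```

The identifier in brackets is kept as a string because it is a name taken from the PDB file rather than a number assigned by WHAT IF.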

Contact analyses involving water

Several options exist that can aid with the analysis of contacts between water and protein or water and water. Be aware that the ANACON menu, of course, also holds options that can aid with contact analyses. If any of these options prompts you for a residue range, you can also try to give a group of waters. WHAT IF will (politely) tell you when your input is not allowed.

The same criteria for contact or non-contact are used as in the ANACON menu, i.e. a contact exists if the distance between the Van der Waals surfaces of two atoms is shorter than the cutoff. The default value for the cutoff is 0.25 Angstrom.

The output numbers of course depend heavily on the cutoff distances selected!
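
The contact criterion can be written out in a few lines. The following Python sketch only illustrates the rule as stated here and is not WHAT IF's own code; the function names are assumptions. The radii in the usage note are the VdW values printed in the listings of this chapter (1.7 for a backbone N, 1.4 for a water oxygen).

```python
import math


def surface_distance(xyz_a, radius_a, xyz_b, radius_b):
    """Distance between the Van der Waals surfaces of two atoms (Angstrom)."""
    return math.dist(xyz_a, xyz_b) - radius_a - radius_b


def in_contact(xyz_a, radius_a, xyz_b, radius_b, cutoff=0.25):
    """True if the surface-to-surface distance is shorter than the cutoff."""
    return surface_distance(xyz_a, radius_a, xyz_b, radius_b) < cutoff
```

For example, calling in_contact((-7.2, 32.9, -6.6), 1.7, (-8.4, 34.4, -4.5), 1.4) tests the N of MET 1 against water 1476 (297) with the rounded coordinates printed in the listings. Two atoms whose surfaces are 1.7 Angstrom apart are not in contact at the default cutoff, but are in contact at a cutoff of 2.0, which is why the output depends so strongly on the cutoff.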

List waters near one residue (WATNAA)

The command WATNAA will prompt you for a range of residues, and for a group of waters. It will then list all residues that contain atoms that contact water molecules. For every residue it also lists the waters that make the contact and the atoms involved, and for each contact the distance and the hydrogen bond energy (see the HBONDS chapter). If you want to evaluate water-water contacts between groups of water molecules, you can give a group of waters when the residue range is requested. You might, however, not like the output...

Output of the WATNAA option should roughly look like:

The maximal allowed distance between VdW surfaces is: 0.250
 
Residue:                        1 MET  (1   )-
Atom    X     Y     Z   Acc   B   WT   VdW  Colr   Charg AtOK  Val
N     -7.2  32.9  -6.6  0.0 13.9  1.0  1.7  340    0.00   +    0.00
CA    -5.9  32.1  -6.7  0.0 17.1  1.0  1.8  240    0.00   +    0.00
C     -5.8  31.0  -5.6  0.0 16.7  1.0  1.8  240    0.00   +    0.00
O     -6.6  31.1  -4.6  0.0 16.6  1.0  1.4  120    0.00   +    0.00
CB    -4.7  33.1  -6.5  0.0 17.0  1.0  1.8  240    0.00   +    0.00
CG    -4.6  33.8  -5.2  0.0 18.1  1.0  1.8  240    0.00   +    0.00
SD    -3.1  34.8  -4.9  0.0 19.7  1.0  2.0  180    0.00   +    0.00
CE    -3.0  35.9  -6.3  0.0 20.4  1.0  1.8  240    0.00   +    0.00
 
Atom    X     Y     Z   Acc   B    WT   VdW Col Dist Hene  #Water
N     -7.2  32.9  -6.6  0.0 13.9  1.0  1.7 340  2.9  0.00 1476 (297  )
 
Residue:                        2 THR  (2   )-
Atom    X     Y     Z   Acc   B   WT   VdW  Colr   Charg AtOK  Val
N     -4.9  30.1  -5.8  0.0 14.7  1.0  1.7  340    0.00   +    0.00
CA    -4.7  29.0  -4.9  0.0 16.9  1.0  1.8  240    0.00   +    0.00
  ......
Etcetera
Each residue is shown as if it were listed by LISTA, followed by a listing of the water atoms contacting this residue. Many of the zeros in this output can be non-zero if the appropriate options (e.g. SETACC, CHARGE, etc.) were used before WATNAA was called.

List residues near water (NAAWAT)

The command NAAWAT lists water-protein contacts. You will be prompted for a residue range and for a group of water molecules. For every water the nearest atom in any of the residues will be listed. The interatomic distance and the hydrogen bonding energy (see HBONDS chapter) will be listed.

Output of the NAAWAT option should roughly look like:

   #Water          X     Y     Z     #RES TYP  #PDB  Atom   D      Hene
  1356 (169  )   13.59  43.39  17.78   30 ASP (30  ) N      2.95   0.92
  1357 (170  )    8.25  36.06  23.21   33 ASP (33  ) N      2.91   0.83
  1358 (171  )    3.25  37.85  19.45   35 THR (35  ) OG1    2.74   0.83
  1359 (172  )    5.95  35.62  20.10   33 ASP (33  ) O      2.68   0.00
  1360 (173  )    3.67  32.23  18.17   58 THR (58  ) O      2.61   0.00
  1361 (174  )    2.90  37.17  15.50   40 TYR (40  ) OH     2.67   0.80
  1362 (175  )    2.10  32.31  24.12   35 THR (35  ) O      2.97   0.00
  1363 (176  )   -1.42  37.65  19.67   36 ILE (36  ) N      2.83   0.77
  ......
Etcetera
Numbers in brackets are identifiers as read from the PDB file. Hene is the hydrogen bond energy of the contact on a scale 0.0 - 1.0 (see the HBONDS chapter). D is the distance between the atoms.
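
The nearest-atom search that NAAWAT describes can be sketched as follows. This is a minimal Python illustration and not the program's implementation. The function name and the (label, coordinates) input format are assumptions, and the hydrogen bond energy is left out because its definition belongs to the HBONDS chapter.

```python
import math


def nearest_atom(water_xyz, atoms):
    """Return (label, distance) for the atom closest to one water.

    atoms is an iterable of (label, (x, y, z)) pairs covering the residue range.
    """
    label, xyz = min(atoms, key=lambda atom: math.dist(water_xyz, atom[1]))
    return label, math.dist(water_xyz, xyz)
```

Running this once per water in a group gives one line per water, as in the listing above. Distances computed from the rounded coordinates in the listings can differ slightly from the printed D values.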

All residues near all waters (NALWAT)

The command NALWAT will create two lists. The first lists all atoms that contact at least one water molecule, together with all water molecules that they contact. The second list holds just the opposite data: every water that contacts at least one protein atom is listed, together with all protein atoms it contacts. You will be prompted for a residue range and for a group of water molecules.

In both lists all atomic coordinates will be listed for the water and the contacting atom in the residue range. The distance and the H-bond energy on a scale from 0.0 - 1.0 (See HBONDS chapter) are given too.

The output of NALWAT roughly looks like:

#RES TYP #PDB  Atom   X     Y     Z    #Water     X    Y    Z    D    Hene

  1 MET (1   ) N    -7.2  32.9  -6.6 1476 (297) -8.4 34.4 -4.5  2.91  0.69
  2 THR (2   ) N    -4.9  30.1  -5.8 1364 (177) -3.2 30.0 -8.2  2.93  0.88
  2 THR (2   ) CB   -3.6  28.1  -5.4 1364 (177) -3.2 30.0 -8.2  3.43  0.00
  2 THR (2   ) OG1  -4.1  27.6  -6.6 1364 (177) -3.2 30.0 -8.2  3.05  0.41
  3 GLU (3   ) N    -4.9  29.3  -2.5 1384 (198) -6.9 27.0 -2.8  3.03  0.78
  3 GLU (3   ) O    -4.5  27.4  -0.4 1381 (195) -7.0 26.6  0.7  2.84  0.00
                                     1383 (197) -4.3 24.6 -0.9  2.83  0.00
  3 GLU (3   ) OE1  -7.6  32.2   0.7 1486 (312) -6.9 30.7  3.1  2.87  0.00
  4 TYR (4   ) OH   -0.1  25.4  -4.2 1382 (196) -2.0 23.3 -4.8  2.90  0.72
  5 LYS (5   ) CG   -6.1  27.8   4.4 1486 (312) -6.9 30.7  3.1  3.32  0.00
  ......
Etcetera

  #Water     X    Y    Z    D    Hene #RES TYP #PDB  Atom  XYZ
1356 (169) 13.6 43.4 17.8  2.95  0.92  30 ASP (30  ) N    13.6  40.5  18.3
1357 (170)  8.2 36.1 23.2  2.91  0.83  33 ASP (33  ) N     7.5  38.8  23.6
1358 (171)  3.3 37.8 19.5  2.86  0.00  33 ASP (33  ) O     5.0  37.6  21.7
                           2.74  0.83  35 THR (35  ) OG1   2.9  35.1  19.2
1359 (172)  5.9 35.6 20.1  3.38  0.00  17 SER (17  ) CB    6.4  35.3  16.8
                           2.68  0.00  33 ASP (33  ) O     5.0  37.6  21.7
1360 (173)  3.7 32.2 18.2  3.44  0.00  16 LYS (16  ) CE    4.4  28.9  18.9
                           3.39  0.00  35 THR (35  ) CB    1.7  34.5  19.8
                           3.36  0.00  57 ASP (57  ) CG    2.8  33.3  15.1
  ......
Etcetera
If the first columns are empty, then the same atom as above is meant. Numbers in brackets are identifiers as read from the PDB file. Hene is the hydrogen bond energy of the contact on a scale 0.0 - 1.0 (see the HBONDS chapter). D is the distance between the atoms.

Statistics on water positions relative to residues (STSWAT)

The command STSWAT will prompt you for a group of waters and for a range of residues. It will then give several kinds of statistics about the positions of the waters relative to residues, atoms, pairs of residues, etc.

The following tables will be produced:

Per residue type: its frequency; the total number of waters touched by residues of this type; the average number of waters touched by a residue of this type.

In the second table these numbers are split out over the individual atoms.

The third table lists all waters, and for each water the frequency of residue types that have at least one atom within the cutoff radius distance. Also the nearest residue is listed.

Since this option slowly grows over the months, I suggest you just try it...

The output of the STSWAT options starts roughly with:

 RES FREQ # WAT WAT/RES
 ALA    11   23   2.09
 CYS     3    4   1.33
 ASP    14   79   5.64
  ......
Etcetera. 
RES is the residue type; FREQ is the frequency of occurrence of this residue type in the range you gave; WAT is the total number of waters contacted by all residues of this type; WAT/RES is WAT divided by FREQ.
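
The last column of this first table is plain arithmetic and can be checked against the sample rows. The helper below is a hypothetical Python sketch; returning 0.0 for a residue type that does not occur is an assumption, not documented WHAT IF behaviour.

```python
def waters_per_residue(freq, wat):
    """WAT/RES: number of contacted waters divided by the residue frequency."""
    if freq == 0:
        return 0.0  # assumption: avoid dividing by zero for absent residue types
    return wat / freq


# Rows taken from the sample output: (RES, FREQ, WAT).
rows = [("ALA", 11, 23), ("CYS", 3, 4), ("ASP", 14, 79)]
table = [(res, freq, wat, f"{waters_per_residue(freq, wat):.2f}")
         for res, freq, wat in rows]
```

Formatted to two decimals, the three ratios are 2.09, 1.33 and 5.64, in agreement with the listing.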
The maximal allowed distance between VdW surfaces is: 2.000
 Res Atom  FREQ  #Wat Wat/Atom
 ALA N      11   12   1.09
 ALA CA     11   16   1.45
 ALA C      11   15   1.36
 ALA O      11   11   1.00
 ALA CB     11   16   1.45
 
 CYS N       3    0   0.00
 CYS CA      3    1   0.33
 CYS C       3    0   0.00
 CYS O       3    2   0.67
 CYS CB      3    0   0.00
 CYS SG      3    1   0.33
 
 ASP N      14   21   1.50
 ASP CA     14   35   2.50
  ......
This table is similar to the first one, however, now individual atoms are separated rather than added up per residue as in the first table.
Listing of number of contacts per individual water
Atom  PDB#   A C D E F G H I K L M N P Q R S T V W Y   Nearest
 1356 (169)  0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0   31 GLU (31  )
 1357 (170)  0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1   34 PRO (34  )
 1358 (171)  0 0 2 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0   38 ASP (38  )
 1359 (172)  0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0   35 THR (35  )
  ......
Etcetera.
In this third table, each water is listed with the residue types it contacts (and how many of each), and with the residue with which it makes the shortest contact.
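
One row of this third table can be reproduced from a list of contacting residues. The Python sketch below assumes that each contacting residue is counted once, however many of its atoms are within the cutoff. The function name and the input format are illustrative, and the residue numbers in the usage note are made up.

```python
from collections import Counter

# Column order of the table header: A C D E F G H I K L M N P Q R S T V W Y
ONE_LETTER_ORDER = "ACDEFGHIKLMNPQRSTVWY"
THREE_TO_ONE = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y",
}


def type_counts(contacting_residues):
    """Count contacting residues per type for one water.

    contacting_residues is an iterable of (residue_number, three_letter_type);
    duplicates (several atoms of the same residue) are counted once.
    """
    unique = set(contacting_residues)
    counts = Counter(THREE_TO_ONE[res_type]
                     for _, res_type in unique if res_type in THREE_TO_ONE)
    return [counts[letter] for letter in ONE_LETTER_ORDER]
```

A water that touches two ASP residues, one PRO, one SER and one THR, for instance type_counts([(33, "ASP"), (38, "ASP"), (34, "PRO"), (17, "SER"), (35, "THR")]), gives the same pattern of counts as the row for water 1358 (171) above.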

Distance of atoms to water (CABWAT)

The option CABWAT will prompt you for a residue range. For all atoms in this range the distance to the nearest bulk water molecule is determined. These distances are accurate to within 1.0 Angstrom.

Using the distances of the C-alpha and the C-beta to water, a coarse estimate is given of the orientation of the side chains relative to the solvent. The side chains are classified as pointing inwards, pointing outwards, or lying parallel to the surface. This division into three classes is not to be trusted very much.

CABWAT takes a hell of a lot of CPU time. The output roughly looks like:

The water radius = 1.400
Residue:                        1 MET  (1   )-
Atom   X     Y     Z   Acc   B   WT   VdW  Colr Charg AtOK  Val
N    -7.2  32.9  -6.6  0.0 13.9  1.0  1.7  340  0.00   +    3.11
CA   -5.9  32.1  -6.7  0.0 17.1  1.0  1.8  240  0.00   +    3.21
C    -5.8  31.0  -5.6  0.0 16.7  1.0  1.8  240  0.00   +    3.21
O    -6.6  31.1  -4.6  0.0 16.6  1.0  1.4  120  0.00   +    2.83
CB   -4.7  33.1  -6.5  0.0 17.0  1.0  1.8  240  0.00   +    3.34
CG   -4.6  33.8  -5.2  0.0 18.1  1.0  1.8  240  0.00   +    3.39
SD   -3.1  34.8  -4.9  0.0 19.7  1.0  2.0  180  0.00   +    3.63
CE   -3.0  35.9  -6.3  0.0 20.4  1.0  1.8  240  0.00   +    3.26
 
Residue:                        2 THR  (2   )-
Atom   X     Y     Z   Acc   B   WT   VdW  Colr Charg AtOK  Val
N    -4.9  30.1  -5.8  0.0 14.7  1.0  1.7  340  0.00   +    3.19
CA   -4.7  29.0  -4.9  0.0 16.9  1.0  1.8  240  0.00   +    3.31
  ......
Etcetera.

    1 MET  (1   ) Is parallel to surface
    2 THR  (2   ) Is parallel to surface
    3 GLU  (3   ) Is parallel to surface
    4 TYR  (4   ) Is parallel to surface
    5 LYS  (5   ) Points outwards
    6 LEU  (6   ) Points inwards
    7 VAL  (7   ) Is parallel to surface
    8 VAL  (8   ) Points inwards
    9 VAL  (9   ) Is parallel to surface
  ......
Etcetera.

Listing waters that contact two ranges (DBLWAT)

The command DBLWAT will cause WHAT IF to prompt you for two protein ranges, and one group of water molecules. It will then list all waters that make a contact with at least one residue in each of the two ranges.

It is allowed to make both ranges the same, thereby effectively working with only one range.

If the NEWGRP parameter is set to 1 (see PARAMS), the found waters will be written in a separate group of water molecules.

The output of this option is similar to the output of the LSTWAT option; LSTWAT lists all waters, whereas this option lists only those that have a contact with at least one atom in each of the two ranges.
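
The selection rule of DBLWAT amounts to requiring a contact in both ranges. The following Python sketch shows that rule under stated assumptions: the contacts are already available as a mapping from water to contacted residue numbers, ranges are inclusive (first, last) pairs, and all names are hypothetical.

```python
def waters_bridging(contacts, range_a, range_b):
    """Return the waters that contact at least one residue in each range.

    contacts maps a water identifier to the set of residue numbers it contacts.
    The result keeps the order in which the waters appear in contacts.
    """
    def touches(residues, residue_range):
        first, last = residue_range
        return any(first <= number <= last for number in residues)

    return [water for water, residues in contacts.items()
            if touches(residues, range_a) and touches(residues, range_b)]
```

Passing the same range twice reduces the test to a single range, which matches the remark above that both ranges may be the same.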

Creating (subset) water groups

Manually creating subsets of water groups (COPWAT)

The command COPWAT will cause WHAT IF to repeatedly prompt you for a range of water molecules. Here you should give the original names of the water molecules (given in brackets in the LSTWAT option output) and not the number of the group of water molecules. The number assigned by WHAT IF (normally given just before the original name in LSTWAT output etc.) is not valid input in this case either. A range may even span more than one group of water molecules. All water molecules requested will be copied to a new group of water molecules only once, even if they were indicated twice or more. There are no limitations on the order in which the waters are requested; in the new group they will have the same order as they had in the group from which they were copied. They keep the colour they had before.
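
The copy rules (each water copied once, source order kept, request order irrelevant) can be illustrated with a short Python sketch. It assumes that the original water names are integers and that a range is an inclusive (first, last) pair; both are simplifications made for the example, and the function name is hypothetical.

```python
def copy_waters(soup_waters, requested_ranges):
    """Select waters by original name, once each, in their existing order.

    soup_waters is the list of original water names in soup order.
    requested_ranges is a list of inclusive (first, last) pairs in any order.
    """
    return [name for name in soup_waters
            if any(first <= name <= last for first, last in requested_ranges)]
```

Because the selection walks over the existing waters rather than over the requests, overlapping or repeated ranges cannot produce duplicates, and the new group inherits the order of the source group.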

Moving waters between asymmetric units (DUPWAT)

After refinement and other data handling, waters are often no longer in the same asymmetric unit as the protein. You can use the DUPWAT command to fix this problem.

You will be prompted for a group of water molecules and an environment. It is strongly suggested to make this environment cover the asymmetric unit, but, of course, excluding the waters !!!

All waters or their symmetry partners that are close to the given environment will be put in a new group of water molecules that is added at the end of the soup. (See also MOVWAT).

Moving waters between asymmetric units (MOVWAT)

After refinement and other data handling, waters are often no longer in the same asymmetric unit as the protein. You can use the MOVWAT command to fix this problem.

You will be prompted for a group of water molecules and an environment. It is strongly suggested to make this environment cover the asymmetric unit, but, of course, excluding the waters !!!

All waters will be replaced by the symmetry related partner (or itself) that is closest to the given environment. (See also DUPWAT).
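
The replacement rule of MOVWAT can be sketched as a search over symmetry images. The Python code below is an illustration only. It assumes that the symmetry operators are supplied as (rotation matrix, translation vector) pairs in Cartesian space, which is not necessarily how WHAT IF stores them, and the function names are hypothetical.

```python
import math


def apply_op(op, xyz):
    """Apply one symmetry operator, given as (3x3 rotation, translation)."""
    rotation, translation = op
    return tuple(
        sum(rotation[i][j] * xyz[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def closest_image(water_xyz, ops, environment):
    """Return the water position or symmetry image nearest to the environment.

    environment is a non-empty list of (x, y, z) atom positions.
    """
    def distance_to_environment(point):
        return min(math.dist(point, atom) for atom in environment)

    # The unmoved water is always a candidate ("or itself").
    images = [tuple(water_xyz)] + [apply_op(op, water_xyz) for op in ops]
    return min(images, key=distance_to_environment)
```

DUPWAT differs in that it writes the nearby waters or images to a new group instead of replacing the originals.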

Water related parameters (PARAMS)

The command PARAMS brings you to the menu from which you can change the program parameters related to the water options.

These parameters are:

NEWGRP : flag to determine whether hits are written in a new group
         of water molecules or not.
     0 : Don't write a new group.
     1 : Do write a new group (used by the options MOVWAT, DBLWAT)

NAYWAT : The pickable NAYB option in the screen menu can be modulated
         with this flag.
     0 : Draw lines to all neighbours
     1 : Draw lines to all neighbours, excluding waters
     2 : Draw lines to neighbouring waters only

VDWDST : Distance cutoff between Van der Waals surfaces in most contact
         options in the WATER menu

NETDST : Atom center - atom center distance cutoff in the NETWAT option

Other water related options

Showing nets of water molecules (NETWAT)

The command NETWAT will cause WHAT IF to prompt you for the number of a group of water molecules. It will then draw lines between every pair of water molecules that is within the default distance (see PARAMS about setting the default distance).

WARNING. Use this option ONLY on real groups of water molecules, and not on groups of waters generated with the POTWAT option.
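
The pairing rule behind NETWAT (a line between every pair of waters within the centre-to-centre cutoff) can be sketched in Python as follows. The function name is hypothetical, the cutoff is passed in explicitly because the default value of NETDST is not given here, and treating a distance exactly equal to the cutoff as a hit is an assumption.

```python
import math
from itertools import combinations


def water_net(waters, netdst):
    """Return index pairs (i, j), i < j, of waters within netdst of each other.

    waters is a list of (x, y, z) centre coordinates.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(waters), 2)
        if math.dist(a, b) <= netdst
    ]
```

The number of pairs tested grows quadratically with the size of the group, so a large group of waters takes noticeably longer than a small one.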

Potential water positions (POTWAT)

WHAT IF contains a small database of water positions, relative to amino acid positions. This database was obtained using roughly 20 very highly refined proteins. The command POTWAT can be used to obtain potential water positions around one residue. These waters will be positioned as if the one residue is in vacuum. The generated waters will be added to the soup as one group of waters. You cannot do much with this option yet, except get an idea of what reasonable water positions are.

Verification of water positions (CHKWAT)

The option CHKWAT does the same as the H2OCHK option in the CHECK menu. CHKWAT checks if the waters in the soup are plausibly placed. Warnings are issued for waters that are highly unlikely to be placed correctly in the model/structure.