WHAT IF needs coordinates. Without coordinates the program is still a nice
database handler, and it can tell you what time it is, but without good
coordinates there is not much need for using WHAT IF.
WHAT IF can read and write PDB-files (Brookhaven protein data bank
format) and GROMOS files and it can read DIANA files.
The central data structure in WHAT IF is the so-called 'SOUP'. The
SOUP is an assembly of water with
all molecules in it. WHAT IF knows five kinds of molecules:
1) protein;
2) drugs/co-factors;
3) DNA/RNA;
4) single atomic molecules;
5) (groups of) water molecules.
Because WHAT IF can only work with a finite number of molecules at one
time, water molecules are
taken together as one molecule, consisting of all the water molecules that
came from one source (e.g. one input file, or one water position prediction).
The SOUP owes its name to the fact that it consists of molecules
floating around in water. However, there do not necessarily have to be water
molecules present.
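The SOUP described above can be pictured with a minimal Python sketch. This is not WHAT IF's internal representation; the names MoleculeKind, Molecule, Soup and add_waters are hypothetical and only illustrate the five kinds of molecules and the grouping of waters per source.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class MoleculeKind(IntEnum):
    """The five kinds of molecules, numbered as in the list above."""
    PROTEIN = 1
    DRUG = 2            # drugs and co-factors
    NUCLEIC_ACID = 3    # DNA/RNA
    SINGLE_ATOM = 4     # single atomic molecules
    WATER = 5           # (groups of) water molecules


@dataclass
class Molecule:
    kind: MoleculeKind
    source: str                              # hypothetical: file or prediction it came from
    atoms: list = field(default_factory=list)


@dataclass
class Soup:
    molecules: list = field(default_factory=list)

    def add_waters(self, source, water_atoms):
        """Keep all waters that came from one source together as one molecule."""
        for mol in self.molecules:
            if mol.kind is MoleculeKind.WATER and mol.source == source:
                mol.atoms.extend(water_atoms)
                return mol
        mol = Molecule(MoleculeKind.WATER, source, list(water_atoms))
        self.molecules.append(mol)
        return mol


soup = Soup()
soup.add_waters("input.pdb", ["HOH 1", "HOH 2"])
soup.add_waters("input.pdb", ["HOH 3"])
soup.add_waters("predicted", ["HOH 4"])
print(len(soup.molecules))  # 2: one water molecule per source
```

Grouping by source keeps the molecule count low, which is the reason given above for treating waters this way.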
The menu that is activated with the SOUP command allows you to manipulate
the SOUP. The WATER menu performs the addition or deletion of water in case
you want to add or delete them by number. Special operations on water
molecules (like automatic addition or deletion) are also performed from
the WATER menu (see chapter WATER).
This writeup often refers to residues as input to an option. In many
instances, however, the input can also be drugs, and sometimes also solvent.
This is not always clearly documented. Normally you can keep in
mind that if it is chemically sensible, WHAT IF will allow for it. In any
case, just try it. WHAT IF will not crash in case you try something that
is not allowed.
Unfortunately, everyone in the 'Who is who' of biomolecular computing,
crystallography and biophysics has at some point written the one and only
universal standard for coordinates. We therefore need an almost infinite number
of options to read or write coordinate files. Most of these options
have to do with interfacing to specific programs. These options are
described in the chapters that deal with these interfaces. It is envisaged
that a general coordinate reader will be provided for WHAT IF before version
6.0 is ready (December 95).
The command GETMOL is the general way of getting coordinates from a PDB
file into memory.
(With GETGRO you can read GROMOS formatted coordinate files).
This is a command from the general menu, which means that you can execute
it from every menu. You will be prompted for the name of the PDB file
and thereafter the symmetry matrices, and ALL coordinates are
read from this file and ADDED to the soup. If you want to start with an
empty soup, you should first execute the INISOU command from the SOUP
menu.
There are many ways to write coordinates to a file. Many options do so
automatically (e.g. SHOHST, SPLINE, REFI, etc.). The generic command
however is MAKMOL in the soup menu. This command writes a PDB file.
Most coordinate related options are present in the SOUP menu. The commands
GETGRO and GETMOL can be executed from ALL menus.
Whenever WHAT IF adds coordinates to the soup these coordinates need a set
name. This set name is very handy if you want to remember which molecule in
the soup came from which input file. If you are prompted for the set name
and you hit just return, the set name will be made identical to the file name.
The command GETMOL will cause WHAT IF to prompt you for a PDB file. It will
then read all coordinates from this file, and add them to the soup.
If the file is not found in your local directory,
but it exists in the central PDB directory on your machine,
you will be asked if you want to use this PDB file instead. Make your local
WHAT IF manager aware of the notes
on the configuration files if WHAT IF cannot find the standard PDB directory.
(The standard PDB directory must be put in the CCONFI.FIG file).
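The lookup order described above (first your local directory, then the central PDB directory) can be sketched in Python as follows. The function name locate_pdb_file and its parameters are hypothetical, and the sketch only reports where the file was found; WHAT IF itself asks for confirmation before using the central copy.

```python
from pathlib import Path


def locate_pdb_file(name, local_dir, central_dir):
    """Return (path, origin), with origin 'local' or 'central', or (None, None)."""
    local = Path(local_dir) / name
    if local.is_file():
        return local, "local"
    central = Path(central_dir) / name
    if central.is_file():
        # This is the point where the user would be asked whether
        # the central copy should be used instead.
        return central, "central"
    return None, None
```

A caller would pass the central directory that is configured for the local installation; the sketch makes no assumption about how that setting is stored.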
The command GETGRO will cause WHAT IF to prompt you for a formatted
GROMOS coordinate file. It will
then read all coordinates from this file, and add them to the soup.
The command MAKMOL is the only correct way to write PDB files. You will be
prompted for a template coordinate file. The header of this template
file will be copied to the output PDB-file. Thereafter you will be prompted
for the name of the PDB-file to be created. Last you will be prompted for
the residue ranges.
The command SAVSOU will cause WHAT IF to prompt you for a save-file number.
It will then create a file (numbered as requested) and put all presently
available data in the soup (molecules, residues, atoms, secondary
structure, accessibilities, etc.) in this file. You can later use RESSOU
to restore the soup from this file.
If you have previously saved the soup in a save-file with the SAVSOU command,
you can use the RESSOU command to restore the soup from that save-file. Be
aware that RESSOU will first destroy ALL data presently in the soup.
The command SAVSTA will cause WHAT IF to prompt you for a save-file number.
It will then create a file (numbered as requested) and put all presently
available data in the soup (molecules, residues, atoms, secondary
structure, accessibilities, etc.) in this file. So far all is similar
as for SAVSOU, but SAVSTA additionally tries to save the interactive
status (scale, translation, view etc., labels, objects on/off etc.).
You can later use RESSTA
to restore the status from this file.
If you have previously saved the status in a save-file with the SAVSTA command,
you can use the RESSTA command to restore the status from that save-file. Be
aware that RESSTA will first destroy ALL data presently in the soup.
WHAT IF was originally designed to work without explicit protons.
We are presently adapting the program to accept protons as independent
atoms. This cannot be done overnight. Many options can already
deal with explicit protons correctly; several options cannot yet.
If you want to use explicit protons, give the following magical
command as the first command in a WHAT IF session:
SETICO 29 1
Be aware, however, that several options do not yet treat the protons
correctly, and some options will even create a stack-dump if used
with the proton option active.
The protonisation is expected to be finished by mid 1996.
See the command ADDHYD in the refine menu for `dreaming` proton
coordinates.
The command SOUP brings you in the menu from which you can manipulate the
SOUP. At present SOUP consists of water with molecules in it. These molecules
can be protein, DNA/RNA, non-water solvent, or drug. Everything not recognized
by WHAT IF will be called drug. So, co-factors like FAD, or complex solvent
molecules like MPD will be called drugs. Ions like Cu2+ Ca2+ etc. will be
called non-water solvent molecules.
The commands in the SOUP menu can be logically grouped as follows:
1) look at the SOUP;
2) cut or paste proteins;
3) delete or insert molecules or residues;
4) save or restore amino acids;
5) cys-cys bridge related options;
6) other options.
In the SOUP menu you will find the command MORE. This command can be used to
increase the number of options in the SOUP menu. Normally only the most used
commands in this menu are visible, but MORE will also make the less frequently
used options visible in the menu.
The command SHOSOU will cause WHAT IF to show you the contents of the SOUP. The
number of molecules will be shown, as well as their names. The molecules will
be divided in the following classes: -1 = undefined; 0 = indicative of a
program bug; 1 = protein; 2 = drug; 3 = DNA/RNA; 4 = solvent, non-water;
5 = water.
The ranges of residues spanned by molecules and the total content per molecule
class are also shown.
WHAT IF decides whether two residues are covalently bound by looking at the
distance between the alpha carbon coordinates. Sometimes it makes multiple
molecules out of one protein when you don't want that. The cut and paste
commands are available to overrule WHAT IF's ideas about this. Also it
is nice to fool WHAT IF sometimes by telling it that all proteins are one big
molecule shortly before you run an option that can only work on one
molecule at a time.
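The distance-based decision and the manual overrule can be sketched in Python as follows. The cutoff of 4.2 Angstrom is an assumption made for this illustration (the value WHAT IF uses is not given here), and split_into_molecules, cut_after and paste_after are hypothetical names. Residues are identified by their position in the list.

```python
import math


def split_into_molecules(ca_coords, max_ca_distance=4.2,
                         cut_after=(), paste_after=()):
    """Group consecutive residues into molecules by alpha carbon distance.

    max_ca_distance (Angstrom) is an assumed cutoff, not WHAT IF's value.
    cut_after / paste_after hold residue positions after which a break
    or a bond is forced, overruling the distance criterion.
    """
    molecules = []
    for i, coord in enumerate(ca_coords):
        if i == 0:
            joined = False
        elif (i - 1) in cut_after:
            joined = False
        elif (i - 1) in paste_after:
            joined = True
        else:
            joined = math.dist(ca_coords[i - 1], coord) <= max_ca_distance
        if joined:
            molecules[-1].append(i)
        else:
            molecules.append([i])
    return molecules


cas = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(split_into_molecules(cas))                   # [[0, 1, 2], [3]]
print(split_into_molecules(cas, paste_after={2}))  # [[0, 1, 2, 3]]
```

Only neighbouring residues are compared, so two stretches that are separated in the list by a third one are never joined, however close they are in space.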
The command PASTE will cause WHAT IF to prompt you for the C-terminal
residue of a molecule. It will then paste this residue and the N-terminal
residue of the next molecule in the soup, thereby making one molecule out
of the two. If you try to paste at a position where you previously placed a
cut-mark (see CUT), only this cut-mark will be removed and WHAT IF will
automatically determine whether there will be a chain break or not. If you
want to be sure that a paste-flag is set in such a case, you should paste
at the same place twice.
The command PASTAL will cause WHAT IF to execute the PASTE command (see above)
automatically for all proteins in the SOUP. PASTAL will first execute the
INIPAS command (see below), so all previously set cut-flags and paste-flags
are removed first.
The command CUT will cause WHAT IF to prompt you for a residue number. It
will then act like a protease at the C-terminal side of this residue. Thus
if this was not the C-terminal residue of a molecule, the molecule you are
cutting will change into two molecules. If you try to cut at a position
where you previously placed a paste-mark (see PASTE), only this paste-mark
will be removed and WHAT IF will automatically determine whether there will
be a chain break or not. If you want to be sure that a cut-flag is set in
such a case, you should cut at the same place twice.
The command INIPAS will cause WHAT IF to remove all manually set cut and
paste flags. It will thereafter re-determine what it thinks are independent
molecules and what not. Hereby it uses solely distance criteria. Also two
molecules that are in the soup separated from each other by a third one can
never become one molecule, no matter how close they are in space.
The command SHOPAS can be used to list all presently set cut and paste flags.
If you want to try mutations (see mutating residues) you often might want to
go back to the original situation later. You can of course write
intermediate PDB-files every time, but there is also the possibility to save and later
restore residues. This is a much faster procedure, and it costs less disk space.
The command SAVAA will cause WHAT IF to prompt you for the number of a residue.
It will then write the residue in a file. You can later restore this residue
with the RESAA command.
The command RESAA will cause WHAT IF to prompt you for the number of a residue.
You will also be prompted for the type of residue you want to insert. This must
be the type that was used during the SAVAA operation.
WHAT IF will then read this residue from its file and add it
to the soup immediately after a residue for which you will be prompted.
If you want to
replace the residue in the soup with the restored residue, you should
delete that residue in the soup, and insert the saved residue after the residue
N-terminal of the one you are replacing. You can either first
restore the previously
saved residue after residue N in the soup, and then delete residue N, or
first delete residue N, and then insert after N-1.
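That both orders of operation give the same result can be checked with a small Python sketch, in which a plain list stands in for the residues in the soup. The function names are hypothetical and positions are counted from 0.

```python
def replace_insert_then_delete(residues, n, saved):
    """Insert the saved residue after position n, then delete position n."""
    result = list(residues)
    result.insert(n + 1, saved)
    del result[n]
    return result


def replace_delete_then_insert(residues, n, saved):
    """Delete position n, then insert the saved residue after position n - 1."""
    result = list(residues)
    del result[n]
    result.insert(n, saved)  # index n is directly after position n - 1
    return result


chain = ["ALA", "TRP", "GLY"]
print(replace_insert_then_delete(chain, 1, "PHE"))  # ['ALA', 'PHE', 'GLY']
print(replace_delete_then_insert(chain, 1, "PHE"))  # ['ALA', 'PHE', 'GLY']
```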
The real WHAT IF hackers can abuse the SAVAA and RESAA options to do rather
complicated modifications of molecules.....
There are many ways to correct, delete, insert, or mutate
amino acids, from many menus
throughout WHAT IF. Direct correction, deletion and insertion
operations can only be performed from the soup menu.
WARNING: many parameters are no longer correct after changes have
been made in the
soup. These parameters involve ROWS, H-BONDS, CUT and PASTE flags, DGLOOP
groups, SALT BRIDGES, or more general, all information that depends on
(pointers to) amino acids.
The following commands are available:
The command INISOU removes all molecules from the soup. Other parameters like
groups, matrices, maps, etc. will remain untouched. The INISOU command
is irreversible!
This command causes WHAT IF to perform the SHOSOU command first, and then prompt
you for the number of the molecule to be deleted. If you give molecule 0 nothing
will be deleted.
This command causes WHAT IF to perform the SHOSOU command first, and then prompt
you for the numbers of the molecules to be deleted. If you give molecule
0 nothing
will be deleted.
The command DELETE will cause WHAT IF to prompt you for a residue number.
That residue will then be deleted from the soup, without any structural
corrections in the environment.
The command CORAA will cause WHAT IF to prompt you for a residue range. All
atoms in this range that are missing will be created by WHAT IF, provided that
at least the backbone N, C-alpha and C are present. You will be asked by
WHAT IF if you also want to correct bad interatomic distances. If you answer with
YES, WHAT IF will move atoms around until the bad interatomic distances are
better. However, this option will also displace some atoms that are actually
placed correctly, and that might not be desired.
Don't worry about the various error messages. They are caused by errors
that would be fatal elsewhere in WHAT IF, but that don't matter too much here.
Be aware that this option only accepts amino acids.
The command CORALL will cause WHAT IF to execute the CORAA option without asking
for the range, because it assumes that all amino acids in the soup should
be corrected (at least those that are wrong). All
missing atoms will be created by WHAT IF, provided that
at least the backbone N, C-alpha and C are present. You will be asked by
WHAT IF if you also want to correct bad interatomic distances. If you answer with
YES, WHAT IF will move atoms around until the bad interatomic distances are
better. However, this option will also displace some atoms that are actually
placed correctly, and that might not be desired.
Don't worry about the various error messages. They are caused by errors
that would be fatal elsewhere in WHAT IF, but that don't matter too much here.
Be aware that this option only works on amino acids.
The command CNTBAD will cause WHAT IF to look at all residues in
the soup. It will count all residues that it thinks are perfect, and all
that it thinks are bad. It will list all bad residues.
WHAT IF normally determines which cysteines are bridged by simple distance
criteria. Every pair of cysteine S-gammas closer than 2.5 Angstrom triggers
a cys-cys bridge.
There are a few commands to manipulate this.
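The distance criterion can be sketched in Python as follows. Only the 2.5 Angstrom cutoff comes from the text above; the function name find_cys_bridges and the input format (residue number mapped to S-gamma coordinates) are assumptions for this illustration.

```python
import math
from itertools import combinations


def find_cys_bridges(sg_coords, cutoff=2.5):
    """Return pairs of residue numbers whose S-gamma atoms are closer than cutoff.

    sg_coords maps a residue number to the (x, y, z) of its S-gamma atom.
    """
    bridges = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sg_coords.items()), 2):
        if math.dist(xyz_a, xyz_b) < cutoff:
            bridges.append((res_a, res_b))
    return bridges


sg = {3: (0.0, 0.0, 0.0), 40: (2.0, 0.0, 0.0), 16: (10.0, 0.0, 0.0)}
print(find_cys_bridges(sg))  # [(3, 40)]
```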
The command SHOCYS will cause WHAT IF to list all cysteine bridges presently
known to it. This includes the self determined ones, and the user set
cysteine bridges.
The command SETCYS will cause WHAT IF to prompt you for the first and for the
second cysteine in a cys-cys bridge. This can of course only be done if there
are at least two unpaired cysteines available.
The command INICYS will cause WHAT IF to remove all flags for manually set
cys-cys bridges, and set all cys-cys bridges according to distance criteria
again.
The following commands are also available from the soup menu:
At present WHAT IF treats C-terminal oxygens still as single atomic individual
molecules. This will be changed in version 6.0. However, till that time, you can
use the ADDOXT command to add C-terminal oxygens where needed. This is for
example needed after you remove one or more residues, and create new
C-termini.
The command GETDBF can be used to get a protein from WHAT IF's relational
structure database into the soup. The command GETDBF will cause WHAT IF to
prompt you for the number of a database file. You can use the INDEX command
in the SCAN3D menu to see which proteins are available.
You will be asked if you want to initialize the soup first. If you answer
with YES, the command INISOU (see above) will automatically be executed first.
If you answer with NO, the requested protein will be added to the soup.
The command MAKDNA will cause WHAT IF to display a mini menu that allows
you to create a DNA molecule. Further information will be provided as soon
as this option is bug free. Till that time, use MAKDNA with great care.
The command NEWUNQ will cause WHAT IF to renumber the unique identifiers
(=PDB identifiers) for the residues in your soup. They will be numbered
1, 2, 3, ... etc. You can use RENUMB if you want alternative numbering
schemes.
The command SETCHA will cause WHAT IF to prompt you for one or more ranges of residues
and for a (new) chain identifier. A chain identifier must be a single character.
It will give all selected residues the chosen chain identifier.
Be aware that this option can get you in deep trouble....
If you give the first half of a chain a different chain identifier from the second
half, you actually converted that one chain into two chains. Every character
is allowed as chain identifier. WHAT IF has no problems with that, but the
official PDB nomenclature only allows for capital A-Z, and several other
programs might count on you using only those chain identifiers.
If you give two disconnected chains the same chain identifier, then a few WHAT IF
options might start giving funny results, and other programs will become
unpredictable.
In summary, this is an option that requires some thinking....
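If you generate chain identifiers from a script, a small check like the following Python sketch can help with that thinking. It is not part of WHAT IF; check_chain_identifier is a hypothetical helper that only encodes the two rules stated above: one single character, and a warning for anything outside capital A-Z.

```python
import string


def check_chain_identifier(chain_id):
    """Return a list of warnings; raise ValueError if not a single character."""
    if len(chain_id) != 1:
        raise ValueError("a chain identifier must be a single character")
    warnings = []
    if chain_id not in string.ascii_uppercase:
        warnings.append("not a capital A-Z; other programs may not accept it")
    return warnings


print(check_chain_identifier("A"))  # []
print(check_chain_identifier("x"))  # one warning
```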
The command SOUCOP will cause WHAT IF to prompt you for a range of residues.
It will then make an exact copy of this range after the last protein in the
soup.
This is a nice option for rearranging your soup without the usual edit
procedures. It is also a useful option for loop transplants.
The following options are so-called hidden options:
The command CLNSOU removes all drugs, co-factors, water, ions, etc. from the
soup. Also, in case proteins and/or DNA/RNA overlap severely in space, the
molecule with the highest number in the soup gets deleted. This is a rather
harsh and irreversible option. Use SAVSOU before you use this option!
One of the most common errors in the residue nomenclature in PDB-like files
is the addition of a fourth character to it (e.g. HISA, ASPH). The GETUS3
command can be used
to overcome this problem.
The command GETUS3 will cause WHAT IF to prompt you for a PDB file. It will
then read all coordinates from this file, and add them to the soup.
The fourth character of the residue name will be skipped upon reading.
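What skipping the fourth character amounts to can be sketched in Python as follows. In the fixed-column PDB format the residue name occupies columns 18-20, so a fourth character ends up in column 21. The function below is a hypothetical illustration that blanks that column; it is not the way GETUS3 is implemented.

```python
def drop_fourth_residue_character(pdb_line):
    """Blank column 21 of ATOM/HETATM records, keeping the 3-letter residue name."""
    if pdb_line.startswith(("ATOM", "HETATM")) and len(pdb_line) > 20:
        return pdb_line[:20] + " " + pdb_line[21:]
    return pdb_line


# Build the start of an ATOM record column by column:
# record name (1-6), serial (7-11), blank (12), atom name (13-16),
# alternate location (17), residue name plus fourth character (18-21),
# chain identifier (22), residue number (23-26).
line = "ATOM  " + "    1" + " " + " N  " + " " + "HISA" + " " + "   1"
fixed = drop_fourth_residue_character(line)
print(repr(fixed[17:21]))  # 'HIS '
```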
If WHAT IF gets confused it sometimes starts spitting incomprehensible
messages at you such as "Soup out of sync". These messages are mainly meant
for us, but that does
not help you much, because your session is about to crash. The best thing to do
in such cases is to run the STATUS command. That produces a lot of
seemingly useless output, but it might rescue your session. After STATUS,
try to use MAKMOL to save your soup, kill WHAT IF, and start again.
This is mainly a debug routine. The very experienced user might read the
comments in the routine MOL010 to see which kinds of pointers are listed.
Sometimes DNA molecules are present in the PDB file in the wrong order (i.e.
the last residue is given first). In these cases INVERT can be used to
invert the order of the bases in the molecule. WHAT IF is not very
clever when dealing with DNA (mainly because I never work with DNA), so
if WHAT IF gets confused about DNA molecules, try this option.
Alternatively, use the FIXDNA option.
By the way, you can also use this option (without any guarantees) on
stretches of protein....
The command MERGED allows you to merge multiple drugs into one single
drug molecule. This is a handy option if you run out of possible molecules
in the soup because of billions of single ions or something similar.
If you want to delete a base pair from the soup, that might be rather
cumbersome work because you have to do a lot of residue number calculations.
With the DELDNA option you can delete an entire base pair by providing the
residue number of just one of the bases.
Sometimes DNA molecules are present in the PDB file with the wrong residues
(i.e. the O3* sits at the wrong base). In these cases FIXDNA can be used to
correct the positions of O3* atoms in the molecule. WHAT IF is not very
clever when dealing with DNA (mainly because I never work with DNA), so
if WHAT IF gets confused about DNA molecules, try this option.
Alternatively, use the INVERT option.
The command SHOTOP will cause WHAT IF to show you most information that it
obtained from the last topology file that was read in. This is normally
the topology file that gets read automatically upon starting WHAT IF.
The command DVADOM will force WHAT IF to overrule its internal
determination of which atoms are bad, and to treat them all
as OK. You can see if atoms are bad when you type LISTA. The
AT OK column has + for good atoms and - for bad atoms.