WHAT IF allows you to correlate atomic parameters. This can be handy if you want
to get the following types of questions answered:
How many potential hydrogen bond donors are buried, but not involved in a
hydrogen bond?
How often is an internal water molecule in between two acidic groups in my molecule?
The principle is that the atomic parameters are converted into rows of logicals.
The elements of these rows are true if the corresponding atom meets the
requirements set by the user (e.g. involved in a hydrogen bond with a
donor-acceptor distance shorter than 3.2 Angstrom and an angle deviating no
more than 45 degrees; or located in a residue with a total surface
accessibility less than 5.0 square Angstrom). These rows can be logically combined. The
results can be analyzed in many different ways.
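
As a rough illustration of this principle (not WHAT IF's own implementation),
a row can be pictured as one logical per atom. The Python sketch below uses
invented atom data and hypothetical field names to answer the first question
above: which potential donors are buried but not involved in a hydrogen bond.

```python
# Hypothetical per-atom data; names and values are invented for illustration.
atoms = [
    {"name": "N",  "accessibility": 0.0,  "in_hbond": True},
    {"name": "OG", "accessibility": 0.0,  "in_hbond": False},
    {"name": "NZ", "accessibility": 12.4, "in_hbond": False},
    {"name": "CA", "accessibility": 0.0,  "in_hbond": False},
]

# One logical per atom: true if the atom meets the requirement.
row_donor = [a["name"][0] in ("N", "O") for a in atoms]  # nitrogen or oxygen
row_buried = [a["accessibility"] == 0.0 for a in atoms]  # completely buried
row_hbond = [a["in_hbond"] for a in atoms]

# Combine: potential donor AND buried AND NOT hydrogen bonded.
row_result = [d and b and not h
              for d, b, h in zip(row_donor, row_buried, row_hbond)]

hits = [a["name"] for a, flag in zip(atoms, row_result) if flag]
print(hits)  # ['OG']
```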
An article written on these parameter correlation rows has been added as
an appendix.
Many options in this menu will work poorly or even incorrectly if you have
overlapping molecules in the soup (e.g. superposed molecules, or a set
of NMR structures).
Use the LSCRIP option (see the chapter on SCRIPT) if you want to repeat a
query over a large set of molecules (e.g. the whole PDB, or a set of NMR
structures); it makes a script run over a whole set of PDB files. The SEARCH
menu allows you to do more complicated queries over the whole PDB than you
can do, for example, with the SCAN3D menu or with other database systems.
It will be slow, but it can do everything...
The command ROWHBO will calculate all hydrogen bonds for all residues
in the soup. All atoms potentially involved in at least one hydrogen bond
will be marked 'true'. See the HBONDS chapter about the difference
between potential and real hydrogen bonds. You will be asked if hydrogen
bonds with water should be included in the calculations. If you answer YES,
all experimentally determined water molecules will be included. If you
also want to include bulk water, you should use the ROWACC option and
mark all atoms with non-zero accessibility.
The command ROWSBR will cause WHAT IF to search for salt bridges in your
protein. Only salt bridges between amino acids will be searched for. All atoms
involved in a salt bridge will get a flag set in a row. A salt bridge is
defined as an acidic oxygen and a basic nitrogen being within a certain
distance. This distance has a default of 5.0 Angstrom. See the chapter on
parameter setting if you want to change this. WHAT IF will ask you whether
you want histidines to be considered basic or not. If you call them basic
then both side chain nitrogens will be considered basic at the same time.
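
The salt bridge definition above can be sketched in Python as follows. This
is only an illustration under stated assumptions: the atom labels, the
coordinates, the "acidic_O"/"basic_N" classification and the function name are
hypothetical, and whether WHAT IF treats the cutoff as inclusive is not stated
in the text.

```python
import math

SALT_BRIDGE_CUTOFF = 5.0  # Angstrom, the default quoted in the text

def salt_bridge_row(atoms, cutoff=SALT_BRIDGE_CUTOFF):
    """atoms: list of (label, kind, (x, y, z)); returns one logical per atom."""
    row = [False] * len(atoms)
    for i, (_, kind_i, xyz_i) in enumerate(atoms):
        if kind_i != "acidic_O":
            continue
        for j, (_, kind_j, xyz_j) in enumerate(atoms):
            # Assumption: a distance equal to the cutoff still counts.
            if kind_j == "basic_N" and math.dist(xyz_i, xyz_j) <= cutoff:
                row[i] = True
                row[j] = True
    return row

# Invented example coordinates.
atoms = [
    ("ASP 10 OD1", "acidic_O", (0.0, 0.0, 0.0)),
    ("LYS 25 NZ",  "basic_N",  (3.0, 0.0, 0.0)),
    ("ARG 40 NH1", "basic_N",  (9.0, 0.0, 0.0)),
    ("ALA 11 CB",  "other",    (1.0, 1.0, 0.0)),
]
row = salt_bridge_row(atoms)
print(row)  # [True, True, False, False]
```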
The command ROWPOL will create a row in which all polar atoms (those
are the nitrogens and oxygens in the sidechains of Arg, Lys, His,
Asp, Asn, Glu and Gln) are set to TRUE.
The command ROWACC will cause WHAT IF to prompt you for ATOMS or ACIDS. By this
it means that the limits to be set on the surface accessibility are for
individual atoms, or for the sum of all atoms in the residue.
You will thereafter be prompted for the surface accessibility limits. Here you
have to give two numbers (the defaults are 0.0 and 0.0, meaning completely
buried). These limits are the lower and upper accessibility values between
which the row element for that atom will be set to true. If ATOMS was
selected, the limits are applied directly to every atom. If ACIDS (meaning
amino acids) was selected, WHAT IF first calculates the total surface
accessibility for the whole residue; if that value is between the limits
given, the row elements are set to true for all atoms in the residue.
Remember that surface accessibility is defined as the area of the sphere on
which the center of a water molecule that touches the atom can be found.
See the chapter on surface area calculations for the
algorithm used to calculate these areas.
Using the default parameters, an accessible surface of 0.1 square Angstrom is
already enough to allow for a hydrogen bond with bulk water.
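
The difference between ATOMS and ACIDS can be sketched as below. The function
name, the data layout and the inclusive treatment of the limits are
assumptions made for this illustration; the accessibility values are invented.

```python
def accessibility_row(atoms, lower, upper, mode="ATOMS"):
    """atoms: list of (residue_number, accessibility), one entry per atom."""
    if mode == "ATOMS":
        # Limits are applied to every atom individually.
        return [lower <= acc <= upper for _, acc in atoms]
    # ACIDS: limits are applied to the summed accessibility of the residue,
    # and all atoms of a qualifying residue are set to true.
    totals = {}
    for res, acc in atoms:
        totals[res] = totals.get(res, 0.0) + acc
    return [lower <= totals[res] <= upper for res, _ in atoms]

# Invented values: residue 1 has one exposed atom, residue 2 is fully buried.
atoms = [(1, 0.0), (1, 3.5), (2, 0.0), (2, 0.0)]
row_atoms = accessibility_row(atoms, 0.0, 0.0, mode="ATOMS")
row_acids = accessibility_row(atoms, 0.0, 0.0, mode="ACIDS")
print(row_atoms)  # [True, False, True, True]
print(row_acids)  # [False, False, True, True]
```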
The command ROWGAC will cause WHAT IF to prompt you for residues.
You can then give real residues, or self made residues (see SHOEAA, SETEAA
etc.) or combinations thereof. All atoms in every copy of
these residues (or in every real residue that
is part of a requested self made residue) will get their
row element set to true.
So, for example, if you answer the question about which residues to label
with `SER ALA BIG`, all atoms in all ala, phe, his, lys, met, arg, ser, trp
and tyr residues will be marked with true.
If you give only one kind of residue, you can select individual atoms. If
you use more residue types at the same time, you can only select ALL, BACK,
or SIDE for all atoms, backbone atoms or sidechain atoms respectively.
You will thereafter be prompted for the surface accessibility limits. Here you
have to give two numbers (the defaults are 0.0 and 0.0, meaning completely
buried). These are the limits on the combined accessibility (per residue) for
all atoms that you selected. If the accessibilities have not yet been
determined, WHAT IF will activate the SETACC command in the ACCESS menu
directly upon starting this option.
Remember that surface accessibility is defined as the area of the sphere on
which the center of a water molecule that touches the atom can be found.
See the chapter on surface area calculations for the
algorithm used to calculate these areas.
Using the default parameters, an accessible surface of 0.1 square Angstrom is
already enough to allow for a hydrogen bond with bulk water.
The command ROWCAV will cause WHAT IF to prompt you for ATOMS or ACIDS. If you
give ATOMS, all row elements belonging to an atom that forms part of the
wall of a cavity will be set. Giving ACIDS will cause WHAT IF to set all
row elements for a residue if at least one of its atoms forms part of
the wall of a cavity. You will also be prompted for the probe radius used
while making the cavity map with the CAVITY option in the MAP menu.
This option is not foolproof. If you did not run the CAVITY option
in the MAP menu, this option will not execute correctly.
The option ROWPDA sets all row elements belonging to nitrogen or oxygen
atoms to true. ROWPDA stands for ROW Potential Donor or Acceptor.
If you give the command ROW1AA, WHAT IF will prompt you for residues.
You can then give real residues, or self made residues (see SHOEAA, SETEAA
etc.) or combinations thereof. All atoms in every copy of
these residues (or in every real residue that
is part of a requested self made residue) will get their
row element set to true.
So, for example, if you answer the question about which residues to label
with `SER ALA BIG`, all atoms in all ala, phe, his, lys, met, arg, ser, trp
and tyr residues will be marked with true.
If you give only one kind of residue, you can select individual atoms. If
you use more residue types at the same time, you can only select ALL, BACK,
or SIDE for all atoms, backbone atoms or sidechain atoms respectively.
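
How a mixture of real and self-made residue names selects atoms can be
sketched as follows. The membership of BIG is inferred from the example above
and is an assumption, as are the function names; in WHAT IF the self-made
residues are whatever was defined with SETEAA.

```python
# Assumed content of the self-made residue BIG, inferred from the example.
GROUPS = {"BIG": {"PHE", "HIS", "LYS", "MET", "ARG", "TRP", "TYR"}}

def expand(names):
    """Replace every self-made residue by the real residues it contains."""
    wanted = set()
    for name in names:
        wanted |= GROUPS.get(name, {name})
    return wanted

def residue_type_row(atom_residue_types, names):
    """One logical per atom: true if the atom sits in a requested residue type."""
    wanted = expand(names)
    return [res_type in wanted for res_type in atom_residue_types]

selected = expand(["SER", "ALA", "BIG"])
row = residue_type_row(["ALA", "GLY", "TRP"], ["SER", "ALA", "BIG"])
print(sorted(selected))
print(row)  # [True, False, True]
```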
The command ROW1AT will cause WHAT IF to first execute the ROW1AA option.
Thereafter you will be prompted for the atom names. If you only gave one
single residue (not a self-made residue) you can give individual atom names.
If you gave several residues, you can now only give ALL (meaning all atoms
in all residues, which makes this option identical to ROW1AA), BACK to
only use the back bone atoms, or SIDE to only use side chain atoms.
A row will be created in which the flags are
set to true for every atom that was given.
If coordinates for H2O molecules are present in the soup you can use ROWNOH
to set a row for atoms that are closer than a certain distance to a water molecule.
Nothing is done with the orientation of the water molecule with respect to
the residue, or the atoms. The distance used is the distance between
the Van der Waals surfaces. The default distance is 0.25 Angstrom.
If coordinates for co-factors are present in the soup you can use ROWNCF
to create a row for atoms that are closer than a certain
distance to a co-factor. Nothing is
done with the orientation of the atom with respect to the co-factor. The
distance between the Van der Waals' surfaces is used.
The default distance is 0.25 Angstrom. All single atoms that are not water
(e.g. metal ions) and all co-factors in the soup are used.
The command ROWBFT will cause WHAT IF to prompt you for a range of
crystallographic B-factors. All atoms that have their B-factors within
this range will get the corresponding logical in the row set to TRUE.
Sometimes other programs can calculate things that you would like WHAT IF to
calculate too, because you could use the extra information in the parameter
correlation searches. The GETVAL option allows for that. Just let the
other program write a file (N lines F10.0 each) with one value per atom.
The atom order should of course be the same (IUPAC atom order) as WHAT IF
uses. If you now use the GETVAL option, you will be prompted for the name
of this value file, and for the range of values. WHAT IF will now read one
value from that file for every atom in the soup, and set the corresponding
logical in the row to TRUE for every atom for which the value read falls within
the given range.
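
What GETVAL does with such a value file can be sketched as below. This is an
illustration only: the function name is hypothetical, the limits are assumed
to be inclusive, and the file is represented by a list of lines so that the
example is self-contained.

```python
def read_value_row(lines, n_atoms, lower, upper):
    """Turn one value per atom (10-character field, F10.0) into a row."""
    values = [float(line[:10]) for line in lines if line.strip()]
    # The file must hold exactly one value per atom, in the soup's atom order.
    if len(values) != n_atoms:
        raise ValueError("expected one value per atom in the soup")
    return [lower <= v <= upper for v in values]

# Invented file contents for a three-atom soup.
value_file = ["      1.50", "     12.00", "      0.25"]
row = read_value_row(value_file, n_atoms=3, lower=1.0, upper=10.0)
print(row)  # [True, False, False]
```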
Residues sitting in one of the three N- or C- terminal positions of a helix
are called helix capping residues. Normally you want GLU (or ASP) at the
N-terminal site, and ARG or LYS at the C-terminal site. This way you make
use of the helix dipole. That is the reason for the name of this option
ROWDIP where DIP stands for dipole. You are prompted for N-caps and for C-caps.
If you answer those questions with YES then the residues in N- or C- cap
position will get all their atoms tagged in the row. If you answer twice
with NO, no row will be generated. You will afterwards be prompted for the
number of the row.
The command ROWHST will cause WHAT IF to evaluate the secondary structure if
that has not been done yet (see commands SETHST or SHOHST). Thereafter you
will be prompted for a secondary structure element (Helix, Sheet, Turn, or
Coil). All atoms in all residues in the main sequence (the one for which you
have coordinates) that are determined to have that type of secondary
structure are set to true.
The command ROWHSP will cause WHAT IF to prompt you for the name of the
HSSP file that corresponds to the present contents of the soup. From this
file it will read the mutability factor. You will be prompted for a lower
and an upper mutability value. All residues for which the mutability falls
within this range will get all their atoms tagged TRUE.
The command ROWMAN will cause WHAT IF to keep prompting you for residue
ranges till you give 0 (zero). All atoms in all residues within these
ranges will be set to TRUE.
The command ROWCON will cause WHAT IF to prompt you for some information.
First you will be prompted for the ranges with which the contact should
take place. Just give one or more ranges, finish with zero.
Second, you will be asked if intra range contacts should be used too. If
you say NO then no atom in the given range will be tagged at all. If you say
YES, then probably many atoms in the given range will be tagged too, because
most residues have some contacts with their covalent neighbours. Atomic
contacts between covalently linked atoms are never taken into account.
Then you will be prompted for ATOMS or ACIDS. If you say ATOMS then only atoms
that make a contact with the given range will be tagged. If you say ACIDS, then
a whole residue will be tagged as soon as one atom in it makes a contact
with an atom in the given range. Finally you will be prompted for the contact
distance. Two atoms are considered to be in contact if their distance minus
the two Van der Waals radii minus the given cutoff is less than zero.
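
The contact criterion stated above translates directly into a small test. The
radii and coordinates below are invented for the example, and the function
name is hypothetical; only the inequality itself comes from the text.

```python
import math

def in_contact(xyz_a, radius_a, xyz_b, radius_b, cutoff):
    """True if distance minus both Van der Waals radii minus cutoff is < 0."""
    return math.dist(xyz_a, xyz_b) - radius_a - radius_b - cutoff < 0.0

# Invented radii (Angstrom) for a carbon-like and an oxygen-like atom.
R_C, R_O = 1.8, 1.4

near = in_contact((0.0, 0.0, 0.0), R_C, (3.3, 0.0, 0.0), R_O, cutoff=0.25)
far = in_contact((0.0, 0.0, 0.0), R_C, (4.0, 0.0, 0.0), R_O, cutoff=0.25)
print(near, far)  # True False
```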
The command ROWCON will cause WHAT IF to prompt you for some information.
First you will be prompted for the ranges with which the contact should
take place. Just give one or more ranges, finish with zero.
Second, you will be asked if intra range contacts should be used too. If
you say NO then no atom in the given range will be tagged at all. If you say
YES, then probably many atoms in the given range will be tagged too, because
most residues have some contacts with their covalent neighbours. Atomic
contacts between covalently linked atoms are never taken into account.
Third you will be prompted for the row that holds the constraints on
the given range. This means that only those atoms will be looked at in the
given range that are tagged TRUE in the row you give. So a contact between
an atom somewhere, and an atom in the given range that is not tagged, will
simply be considered as not being a contact.
Then you will be prompted for ATOMS or ACIDS. If you say ATOMS then only atoms
that make a contact with the given range will be tagged. If you say ACIDS, then
a whole residue will be tagged as soon as one atom in it makes a contact
with an atom in the given range. Finally you will be prompted for the contact
distance. Two atoms are considered to be in contact if their distance minus
the two Van der Waals radii minus the given cutoff is less than zero.
Once several rows of atomic parameter flags have been generated, the nice part
of working with rows begins. They can be logically combined just as groups can.
The way rows can be combined is slightly different from the way this is done
with groups, because of the different nature of what is in them.
The following options can be used to logically combine rows: ROWAND, ROWOR,
ROWNOT, and ROWXOR.
Several other operations that only operate on one row are available too.
ROWAND will prompt you for two rows. It will then generate a row having the
elements set to true for every atom that has its element set to true in both
input rows. You will then be prompted for the output row number. Giving
zero here will cause WHAT IF to throw the row away.
ROWOR will prompt you for two rows. It will then generate a row having the
elements set to true for every atom that has its elements set to true in either
one of the two input rows. You will then be prompted for the output row number.
Giving zero here will cause WHAT IF to throw the row away.
ROWNOT will prompt you for two rows. It will then generate a row having the
elements set to true for every atom that has its element set to true in the one
row, but not in the other. So the element should be set to true in either the
first input row or the second, but not in both. You will
then be prompted for the output row number. Giving zero here will cause
WHAT IF to throw the row away.
ROWXOR will prompt you for two rows. It will then generate a row having the
elements set to true for every atom that has its element set to true in the
first row, but not in the second. You will then be prompted for the output row
number. Giving zero here will cause WHAT IF to throw the row away.
ROWINV changes every .true. in a row into a .false. and vice versa. You will
be prompted for the output row number. This can be the same as the input row
number. Giving zero will cause WHAT IF to do nothing.
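
The element-wise combinations described above can be pictured with plain
Python lists. The variable names describe the behaviour rather than a
particular command, and the two input rows are invented; this is a sketch of
the logic, not of WHAT IF's internals.

```python
row_a = [True, True, False, False]
row_b = [True, False, True, False]

# True in both input rows.
both = [a and b for a, b in zip(row_a, row_b)]
# True in at least one of the two input rows.
either = [a or b for a, b in zip(row_a, row_b)]
# True in one row but not in the other (in either order).
exactly_one = [a != b for a, b in zip(row_a, row_b)]
# True in the first row but not in the second.
first_only = [a and not b for a, b in zip(row_a, row_b)]
# Every true becomes false and vice versa.
inverted = [not a for a in row_a]

print(both)         # [True, False, False, False]
print(either)       # [True, True, True, False]
print(exactly_one)  # [False, True, True, False]
print(first_only)   # [False, True, False, False]
print(inverted)     # [False, False, True, True]
```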
The command ROW1TA will cause WHAT IF to prompt you for one row, and a residue
range. The resultant row will have the same atoms tagged outside the given
range as the input row. Within the given range all atoms will be set true in
every residue that has at least one atom true.
The command ROW1TO will cause WHAT IF to prompt you for one row, and a residue
range. The resultant row will have the same atoms tagged outside the given
range as the input row. Within the given range all atoms will be set false in
every residue that has at least one atom false.
The command ROW0TA will cause WHAT IF to prompt you for one row, and a residue
range. The resultant row will have the same atoms tagged outside the given
range as the input row. Within the given range all atoms will be set true in
every residue that has all atoms false.
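
The three residue-wise operations above differ only in the condition that
triggers the fill and in the value that is written. A minimal sketch, assuming
a hypothetical data layout with one residue number per atom and a set of
residue numbers standing in for the residue range:

```python
def residue_fill(row, residues, selected, mode):
    """row: one logical per atom; residues: residue number per atom;
    selected: residue numbers inside the given range."""
    out = list(row)
    by_res = {}
    for i, res in enumerate(residues):
        by_res.setdefault(res, []).append(i)
    for res, idx in by_res.items():
        if res not in selected:
            continue  # atoms outside the range keep their input value
        flags = [row[i] for i in idx]
        if mode == "ROW1TA" and any(flags):        # at least one atom true
            value = True
        elif mode == "ROW1TO" and not all(flags):  # at least one atom false
            value = False
        elif mode == "ROW0TA" and not any(flags):  # all atoms false
            value = True
        else:
            continue
        for i in idx:
            out[i] = value
    return out

residues = [1, 1, 2, 2, 3, 3]
row = [True, False, False, False, True, True]
in_range = {1, 2}
r1ta = residue_fill(row, residues, in_range, "ROW1TA")
r1to = residue_fill(row, residues, in_range, "ROW1TO")
r0ta = residue_fill(row, residues, in_range, "ROW0TA")
print(r1ta)  # [True, True, False, False, True, True]
print(r1to)  # [False, False, False, False, True, True]
print(r0ta)  # [True, False, True, True, True, True]
```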
There are several ways to inspect the results of the above mentioned operations:
The command ROWSHO will cause WHAT IF to show you all presently active rows. For
every row it will show you the number of the row, the number of elements set to
true in this row, and the way in which this row has been created.
The command ROWHIT does almost the same as the command LISTA. It shows all amino
acids with all their atoms for a user defined range of amino acids at the
terminal (and, with the log option switched on (see DOLOG and NOLOG), also in
the log file). The only difference is that the seventh column will now
show two exclamation marks ( !! ) for every atom for which the flag is set in
the requested row.
The command ROWTAB will cause WHAT IF to prompt you for a table number, a
residue range and a row number. Every residue with at least a hit for one
atom will be written in the table. Residues without hits in them are
written as blanks.
If you want to see only the residues with a hit in them you can use the command
ROWHTO. You will be prompted for the row number, and the range of amino acids.
WHAT IF will then go over all amino acids in that range and show those that
have at least one hit.
The command ROWHPR will cause WHAT IF to prompt you for a row and a residue
range. For every residue in the range it will give one line of output
consisting of the residue and its type and its name, followed by
the number of hits in this residue, and the maximal number of hits
in this residue type. The latter is of course equal to the number of
atoms in this residue.
The command ROWHP1 will cause WHAT IF to prompt you for a row and a residue
range. For every residue in the range that has at least one atom marked true in the
requested row it will give one line of output
consisting of the residue and its type and its name, followed by
the number of hits in this residue, and the maximal number of hits
in this residue type. The latter is of course equal to the number of
atoms in this residue.
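
The per-residue counts reported by these two options can be sketched as
follows. The function name and the tuple layout (residue number, hits, number
of atoms) are assumptions for this illustration; the real output also lists
the residue type and name.

```python
def hits_per_residue(row, residues, only_hits=False):
    """Return (residue, hits, atoms_in_residue) for every residue.
    With only_hits=True, residues without any hit are skipped."""
    counts = {}
    for flag, res in zip(row, residues):
        hits, total = counts.get(res, (0, 0))
        counts[res] = (hits + int(flag), total + 1)
    return [(res, h, t) for res, (h, t) in counts.items()
            if h > 0 or not only_hits]

residues = [1, 1, 2, 2, 3, 3]
row = [True, False, False, False, True, True]
all_lines = hits_per_residue(row, residues)
hit_lines = hits_per_residue(row, residues, only_hits=True)
print(all_lines)  # [(1, 1, 2), (2, 0, 2), (3, 2, 2)]
print(hit_lines)  # [(1, 1, 2), (3, 2, 2)]
```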
As there are only ten rows to work with, you might need to reset all rows in
order to create space to generate new, other rows. The command ROWINI causes
WHAT IF to irreversibly wipe out all previously generated rows, and all
information about them. Be aware that you do not have to empty a row before
you can write in it. After all rows have been filled, WHAT IF by default
overwrites the tenth row, but you can by hand overwrite any row you want.
WHAT IF has two ways of backing up rows. The one way is row by row, the other
is all rows together. This feature allows CPU intensive search results to be
stored for future sessions. One should be aware however that very strange
things can happen if rows are restored at a moment that the soup contents
are different from the moment that the rows were backed up.
The command MAKROW will cause WHAT IF to prompt you for the number of a row,
and a file name. The row given will be saved in that file. You can later
retrieve the row with the GETROW command.
The command GETROW will cause WHAT IF to prompt you for the name of a row file.
This file must be created with the MAKROW command. It will read the file, and
store the row in the first available free row (or overwrite row 10 if no free
rows are available). Be aware that very strange things can happen in case
the row that you read does not belong with the present soup contents.
The command SAVROW will cause WHAT IF to prompt you for a file name. It will
then store all information presently held about rows in this file. Use the command
RESROW to retrieve the data later.
The command RESROW will cause WHAT IF to prompt you for the name of a file
created with the SAVROW command. It will then initialize all information about
rows presently in memory, and read all information from this file. Be aware
that very strange things can happen if the soup is different now from the
moment that the file was written with the SAVROW command.