The command CHECK will bring you to the so-called CHECK menu. This menu holds
options that each check one or more aspects of protein
structures. Most checks detect exceptional situations, such as
a contact that is seldom seen in the database, but hard errors, such as
a wrong SCALE matrix in a PDB file, can also be detected.
Several of the commands in this menu are also executable from another menu.
For example CHICHK evaluates and checks torsion angles. This option can also be
called as EVACHI from the CHIANG (torsion angle) menu.
Several options in the check menu are so-called 'terminal' options. That
means that they can destroy the status of the soup, and will definitely
leave WHAT IF in an undefined state after the option has finished.
The command FULCHK will cause WHAT IF to write a complete report about
a protein structure. You will get the output in LaTeX format in a file
"pdbout.tex", and in plain text format in "pdbout.txt". Obviously
pictures can only be given in the LaTeX output. If you want to use the
LaTeX output, you will need the latex program and some others. For
your convenience suitable versions of these programs are archived on
our anonymous ftp site "swift.embl-heidelberg.de" in the directory
"/whatif/support".
To use the LaTeX output, you can type:
    latex pdbout     (to reformat the file)
    xdvi pdbout      (to preview the output)
    dvips pdbout     (to make PostScript output)
    lpr pdbout.ps    (or a similar command, to print the PostScript file)
A maximum of 100 lines will be given in any table. If more than 100
problems should be listed, the table is truncated at 50 lines, and the
total number of lines is written at the bottom. Since most tables are
sorted such that the worst numbers are at the top, this should not be
a problem. If you want to see the whole list anyway, you can get it by
running the individual check while creating a logfile (see DOLOG).
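The truncation rule above can be sketched as follows. This is only an
illustration of the rule as stated, not WHAT IF's own code; the function
name and the wording of the footer line are assumptions.

```python
MAX_TABLE_LINES = 100   # longest table that is printed in full
TRUNCATED_LINES = 50    # lines kept once a table exceeds the maximum


def format_table(rows):
    """Return the lines of a report table, truncated as described above.

    `rows` is assumed to be sorted already, with the worst entries first.
    """
    if len(rows) <= MAX_TABLE_LINES:
        return list(rows)
    shown = list(rows[:TRUNCATED_LINES])
    # The footer wording is invented for this sketch.
    shown.append(f"... table truncated; {len(rows)} lines in total")
    return shown
```

A table of exactly 100 problems is printed in full; one with 101 problems
shrinks to 50 lines plus the footer.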
FULCHK is a terminal option. That means that you cannot run FULCHK
in the middle of a WHAT IF session. You run FULCHK on one
molecule, preferably in a "fresh" WHAT IF. After FULCHK has finished, you
are immediately asked to terminate the session with FULLSTOP.
The FULCHK option writes a human readable text file, but also a TeX
style file with several kinds of graphics in it. If you want to get the
programs required for some of these plots, please see the chapter on
licenses.
The command FSTCHK does the same as the FULCHK option. However, rather
than running all checks, only a subset of all checks is executed. You
can control which options are skipped and which are executed with the
TODO.CHK file (of which there is an example in your dbdata directory
of the WHAT IF account). In this file the first three characters of
each line are the Check-Id, and columns 4-6 are either 'YES' or
'NO'. The rest of each line is free; in the example file you can find
out what the check does and how long it normally takes.
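A minimal sketch of how a file with this layout could be read is given
below. It follows only the column description above; the function name is
an assumption, and the Check-Ids in the usage note are made up.

```python
def parse_todo_chk(text):
    """Map each Check-Id to True (run the check) or False (skip it)."""
    selection = {}
    for line in text.splitlines():
        if len(line) < 5:
            continue  # too short to hold a Check-Id plus YES or NO
        check_id = line[0:3]               # columns 1-3
        flag = line[3:6].strip().upper()   # columns 4-6
        if flag == "YES":
            selection[check_id] = True
        elif flag == "NO":
            selection[check_id] = False
        # Lines with any other value in columns 4-6 are ignored here.
    return selection
```

For example, the two (invented) lines "AAAYES fast check" and
"BBBNO  slow check" would select AAA and skip BBB.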
The command ACCCHK will calculate and evaluate accessible surfaces. It
will indicate whether the distribution of polar and apolar accessible
and buried atoms looks normal or not. At present I am not yet sure how
to interpret the numbers. This option is not yet finished, and is therefore
not incorporated in the FULCHK report.
The command AXACHK will verify for each atom in the structure whether
it has a distance bigger than 0.7 Angstrom to all proper symmetry
axes. Any atom coming closer than this distance must form a "bump" to
a symmetry related copy of itself. The only exception is a water
molecule that is exactly on an axis; therefore WHAT IF will not
complain in such a case.
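The geometric test behind this check is a point-to-line distance. The
sketch below assumes that the axis is given in Cartesian coordinates as a
point on the axis plus a direction vector, and that "exactly on an axis"
may be read as "within a small tolerance"; both are assumptions of this
example, as are all names in it.

```python
import math

AXIS_LIMIT = 0.7           # Angstrom, the limit quoted in the text
ON_AXIS_TOLERANCE = 1e-3   # Angstrom, assumed reading of "exactly on an axis"


def distance_to_axis(atom, axis_point, axis_direction):
    """Shortest distance from `atom` to the line through `axis_point`
    running along `axis_direction` (all 3-tuples in Angstrom)."""
    v = [a - p for a, p in zip(atom, axis_point)]
    d = axis_direction
    # |v x d| / |d| is the distance from a point to a line.
    cross = (
        v[1] * d[2] - v[2] * d[1],
        v[2] * d[0] - v[0] * d[2],
        v[0] * d[1] - v[1] * d[0],
    )
    return math.sqrt(sum(c * c for c in cross)) / math.sqrt(sum(c * c for c in d))


def too_close_to_axis(atom, axis_point, axis_direction, is_water=False):
    """True if the atom would bump into its own symmetry copy."""
    dist = distance_to_axis(atom, axis_point, axis_direction)
    if is_water and dist <= ON_AXIS_TOLERANCE:
        return False  # a water sitting on the axis is tolerated
    return dist <= AXIS_LIMIT
```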
The command BMPCHK activates a bump check that is rather different from
the bump functions used by e.g. the DEBUMP option.
From a study of WHAT IF's database of high quality structures it was
determined that no pair of non-hydrogen-bonded atoms should have an
inter-atomic distance more than 0.4 Angstrom shorter than the sum of
the two Van der Waals radii. For hydrogen bonded atoms this limit was
found to be 0.55 Angstrom.
In the BMPCHK, all interatomic distances between non-bonded atoms are
calculated, and verified against these rules. If two atoms do come
closer, the amount by which the contact is too short is printed in a
table. In the table it will be indicated whether the bump is between
symmetry relatives (inter) or within the given asymmetric unit
(intra).
A bump will never be reported between two atoms for which the sum
of their atomic occupancies is less than 1.0.
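The rules above can be summarized in a few lines of code. This is a sketch
under stated assumptions, not the BMPCHK implementation: the atom
representation is invented, the Van der Waals radii must be supplied by
the caller, and the reported amount is assumed to be measured from the
allowed limit rather than from the sum of the radii.

```python
import math

# Tolerances from the text, in Angstrom.
OVERLAP_NON_HBOND = 0.40
OVERLAP_HBOND = 0.55


def bump_overlap(atom_a, atom_b, hydrogen_bonded=False):
    """Return by how much a contact is too short, or None if it is acceptable.

    Each atom is a dict with 'xyz' (3-tuple, Angstrom), 'vdw' (Van der
    Waals radius) and 'occ' (occupancy). This layout is hypothetical.
    """
    if atom_a["occ"] + atom_b["occ"] < 1.0:
        return None  # low-occupancy pairs are never reported
    tolerance = OVERLAP_HBOND if hydrogen_bonded else OVERLAP_NON_HBOND
    shortest_allowed = atom_a["vdw"] + atom_b["vdw"] - tolerance
    distance = math.dist(atom_a["xyz"], atom_b["xyz"])
    overlap = shortest_allowed - distance
    return overlap if overlap > 0.0 else None
```

With two atoms of radius 1.7 Angstrom (an illustrative value) at a
distance of 2.8 Angstrom, the contact is about 0.2 Angstrom too short
when the atoms are not hydrogen bonded.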
The command BNDCHK does not require any additional input. It will
perform a number of checks on the chemical bonds in the structure.
First it will check whether all atoms in all protein and nucleic acid
residues are present.
After that it will compare each bond in protein residues with the Engh
and Huber distance parameters [See Engh and Huber, Acta Cryst. A47,
392-400 (1991)] and print a table of all bonds that differ by more than
4 standard deviations from the expected values.
As a third check, the RMS deviation from the mean Engh and Huber
parameters is determined (expressed in standard deviations). This RMS
value is expected to be around 1.0. If it is bigger than 1.5 or smaller
than 0.666 WHAT IF will complain.
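The second and third checks amount to a Z-score calculation. The sketch
below assumes that each bond is supplied as an (observed, expected,
sigma) triple; the function name and the numbers in the usage note are
illustrative and are not actual Engh and Huber parameters.

```python
import math

OUTLIER_Z = 4.0       # bonds further than this many sigma are tabulated
RMS_Z_HIGH = 1.5      # RMS Z-score above this triggers a complaint
RMS_Z_LOW = 0.666     # RMS Z-score below this triggers a complaint


def check_bond_lengths(bonds):
    """Return (outlier indices, RMS Z-score, complain flag).

    `bonds` is a sequence of (observed, expected, sigma) in Angstrom.
    """
    if not bonds:
        raise ValueError("at least one bond is needed")
    z_scores = [(obs - exp) / sigma for obs, exp, sigma in bonds]
    outliers = [i for i, z in enumerate(z_scores) if abs(z) > OUTLIER_Z]
    rms_z = math.sqrt(sum(z * z for z in z_scores) / len(z_scores))
    complain = rms_z > RMS_Z_HIGH or rms_z < RMS_Z_LOW
    return outliers, rms_z, complain
```

For two bonds of 1.55 and 1.51 Angstrom with an expected length of 1.53
and a sigma of 0.02, both Z-scores have magnitude 1, so the RMS Z-score
is about 1.0 and nothing is reported.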
Lastly, BNDCHK will determine whether the deviation from the Engh and
Huber bondlengths is significantly correlated with the direction of
the bond in the crystallographic unit cell. If such a correlation is
found, a new unit cell is calculated where the correlation is gone.
If this message appears, the cell used during refinement is probably
not accurate enough. We do not yet have any experience with what to
do about it, though.
The command BPOCHK will cause WHAT IF to list all buried unsatisfied
hydrogen bond donors or acceptors. This check uses a very
straightforward definition of a hydrogen bond. A more sophisticated
check of unsatisfied hydrogen bond potential is part of the HNQCHK.
The command CHICHK is equivalent to the EVACHI command in the CHIANG
menu.
All torsion angles in the molecule will be compared with the
distribution of the same torsion angle in 150 of the 300 best refined
proteins from the PDB. You will get a score for 'normality' and not
for 'correctness' or energetics. In this score 0.0 means that this
torsion angle value is as normal as it can be, and negative values
represent less common conformations. Residue values below -2.0 warrant
investigation, below -3.0 something strange must be happening.
For this analysis all torsion angles in the residue except omega are
used.
Another part of the CHICHK verifies the phi/psi combination versus a
Ramachandran plot. Residues that are in forbidden areas of the
Ramachandran plot will be listed. Also, a separate check on omega
values will be performed (for PRO and non-PRO residues), and residues with
unusual values are listed.
This check verifies the chain names in the PDB file. All residues with
a certain chain name should be consecutive in the file, otherwise an
error message will be given.
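The consecutiveness rule can be illustrated with a short function that
walks through the chain names in file order. It is a sketch only; the
function name and the input representation (one chain name per residue)
are assumptions.

```python
def split_chains(chain_ids):
    """Return the chain names that reappear after another chain has started.

    `chain_ids` holds one chain name per residue, in file order.
    """
    seen = set()
    problems = []
    previous = object()  # sentinel that never equals a real chain name
    for chain_id in chain_ids:
        if chain_id != previous:
            # A new run of residues starts here.
            if chain_id in seen and chain_id not in problems:
                problems.append(chain_id)
            seen.add(chain_id)
            previous = chain_id
    return problems
```

For the residue order A A A B B B A A the function returns ['A'],
because chain A is interrupted by chain B.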
The command FLPCHK causes WHAT IF to compare all local backbone
conformations (5 residue stretches) with similar (RMSD on alpha
carbons less than 0.5 Angstrom) conformations in the database. The
RMSD of the backbone oxygen in the structure and the database
positions is given. If this value for a residue is above 1.5, manual
inspection of the peptide plane seems advisable. The number of hits
in the database is listed in brackets. This number should normally
be 80, as that is the maximal number of hits WHAT IF looks for. If
this number is considerably less than 80, the RMSD value for the
oxygen position becomes a less sensitive measure of quality.
The option H2OCHK will perform two checks on all water molecules
in the soup.
For all clusters of water molecules H2OCHK will verify whether they
are free-floating in the unit-cell, or touch the protein somewhere.
If a cluster is free-floating this is reported as a problem: it is
very unlikely that such clusters can be seen in the X-ray density, so
the listed water molecules are probably refinement artefacts.
For all water molecules the closest protein molecule is located. If
this is a molecule that is symmetry related to the ones given in the
input file, a warning is given. For optimum usability of the file the
listed waters should be moved such that they are closest to the
untransformed protein molecule. See the MOVWAT option for this.
The command HNDCHK can be used to check for wrong handedness of chiral
atoms in the twenty naturally occurring residues. All atoms with the
wrong chirality will be listed.
HNQCHK runs, in sequence, a set of commands from the HBONDS menu that
belong to the HB2 options. For this a complete calculation is done of the
optimal hydrogen bond network in the protein. A number of warnings can
be generated from the result.
The optimization of the hydrogen bond network considers two
possibilities for the side-chain conformations of HIS, ASN and GLN
residues. The X-ray experiment cannot see the difference between the
two conformations. If the orientation of the side chain of one of
these residues in the optimized H-bond network is different from the
orientation in the input file, a warning is given.
If any buried hydrogen bond donors do not have an acceptor, they are
listed. In high resolution structures these do not occur, because it
is energetically highly unfavourable!
If any polar side chain acceptor does not accept a hydrogen bond, the
atom is listed.
From the optimized hydrogen bond network the protonation state of the
HIS residues (HISD, HISE or HISH) can be deduced. Also, from the
geometry of the HIS ring it is often possible to see which Engh and
Huber parameters have been used for refinement. All these assignments
are printed in a table. If the two assignments for a residue differ it
is good to verify whether the correct parameters have been used for
the refinement.
The command NAMCHK allows you to check the names of atoms. All atoms
with non-IUPAC names will be listed. This involves simple torsion angle
calculations (like for the PHE side chain) as well as checks for the
exchange of atoms (like CG and OG in the THR side chain).
Most checking options write a summary in a file that can be inspected
by, for example, a simple Perl script like the one used in our WWW version of
the CHECK procedures. The file is called 'check.db'.
WHAT IF keeps adding its results to the
end of this file. The command NEWCHK closes the old copy of this file
if it exists. It also closes any TEX files that were made already.
If you want to keep those files you should rename them BEFORE you run
any other check option, because the check options will not even
hesitate for a millisecond, and overwrite the old files.
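Renaming the old files can be automated outside WHAT IF. The helper below
is a hypothetical convenience, not part of WHAT IF; the function name and
the ".old" suffix are assumptions, and the caller lists the files that
should be kept.

```python
from pathlib import Path


def keep_old_results(directory, names=("check.db",), suffix=".old"):
    """Rename existing result files so that a new check run cannot
    overwrite them. Returns the new paths."""
    renamed = []
    for name in names:
        source = Path(directory) / name
        if source.exists():
            target = source.with_name(source.name + suffix)
            # Path.replace overwrites an earlier backup with the same name.
            source.replace(target)
            renamed.append(target)
    return renamed
```

Called before any check option is run, keep_old_results(".") would
turn an existing check.db into check.db.old.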
The planarity of side chains of protein residues is verified against a
database distribution. If any side chain deviates more than 3.0
standard deviations from planarity, this fact is reported.
For each atom connected to an aromatic ring system the distance of the
atom to the least squares plane of the ring is calculated, and
compared with a database distribution. If any value deviates more than
3.0 standard deviations from the plane, this fact is reported.
The command QUACHK is similar to the RNGQUA option in the QUALITY
menu. It activates the packing quality control. See the chapter on
QUALITY control for an explanation. In short:
Every residue with a quality value below -5.0 is suspicious. A sequence
of residues with low quality scores is "interesting".
Every molecule with a global quality below -2.7 is guaranteed wrong. A
molecule with a quality below -2.0 might be misfolded or poorly
refined. Every molecule with a global quality below -1.2 does not
belong in a database of reliable structures.
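The thresholds above can be written down as a small lookup. The function
names and the returned phrases are assumptions made for this sketch; only
the numerical limits come from the text.

```python
RESIDUE_LIMIT = -5.0   # residues below this value are suspicious


def judge_global_quality(score):
    """Translate a global packing quality score into a verdict."""
    if score < -2.7:
        return "guaranteed wrong"
    if score < -2.0:
        return "possibly misfolded or poorly refined"
    if score < -1.2:
        return "not suitable for a database of reliable structures"
    return "no warning"


def suspicious_residues(scores):
    """Return the positions of all residues scoring below the limit."""
    return [i for i, score in enumerate(scores) if score < RESIDUE_LIMIT]
```

The tests are ordered from the strictest limit to the mildest one, so a
molecule scoring -3.0 only receives the most severe verdict.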
The command ROTCHK will compare for all residues their chi-1 rotamer
with the distribution of observed rotamers for the same residue type
in a similar local backbone conformation in the database. A normality
index will be listed. If this index is lower than 0.5 a warning will
be given. A few values are expected to appear for every structure, but
normality values lower than 0.2 should occur only extremely sparingly!
The command SYMCHK is a killer command. That means that it starts by
wiping out the soup. It will then prompt you for the name of the PDB
file for which the symmetry information should be checked. This
file will be read and checked.
This option checks the internal consistency of the SCALE and CRYST
card in the PDB file, and it checks if the crystal can be
reconstructed from the atomic coordinates and the provided symmetry
information. It also checks whether the cell complies with rules set
by the IUCr, and whether there is extra symmetry between so-called
independent molecules.
WGTCHK checks whether all atomic occupancies are between 0 and 1.
XBFCHK verifies the B-factors in the structure. If many buried atoms
have a B factor below 5.0, a warning is given. This either means that
the structure has been determined at low temperature, or that there
are problems in the refinement. If the average B factor for buried
atoms is very high or very low, another warning is given. Finally, the
distribution of B factors (basically the differences between B factors
of bonded atoms) is analyzed. If the result is very strange, a warning
is printed. If this warning appears, the B-factors should probably be
constrained during the refinement. Because these strange observed
differences cannot be caused by thermal motion, adding constraints
could improve the behaviour of the refinement.