THE SHELX HOMEPAGE
The current release of the complete SHELX system is 97-2 (24 March 1998)
1. What is SHELX ?
2. How to obtain SHELX.
3. Application form.
4. Changes between releases 97-1 and 97-2.
5. List of files on ftp site and CDROM.
6. How to install SHELX-97.
7. Frequently asked questions; for further information see Thomas Schneider's FAQs
8. Important changes since SHELX-93 and SHELXS-86.
9. Support and bug reporting.
10. Sheldrick group homepage.
All for-profit and academic users of release 97-1 may update to 97-2 free of charge. For-profit users who requested the programs on CDROM and have already paid the license fee will be sent an update on CDROM.
Note that the file shelxman.htm, which will provide on-line help for using the programs, is NOT READY YET! It will be announced here when it is ready.
1. What is SHELX ?
SHELX is a set of programs for crystal structure determination from single-crystal diffraction data. The first version of SHELX was written at the end of the 1960's. The gradual emergence of a relatively portable FORTRAN subset enabled it to be distributed (in compressed form including test data as one box of punched cards) in 1976. SHELX-76 survived unchanged - the extremely compact globally optimized code proved resistant to mutations - until major advances in direct methods theory made an update of the structure solution part necessary (SHELXS-86). Rewriting and validating the least-squares refinement part proved more difficult, but was finally achieved with the release of SHELXL-93. During this time operating systems such as RDOS and VMS, under which FORTRAN and SHELX ran, rose and fell. SHELXS-86 and SHELXL-93 were essentially upwards compatible with SHELX-76, for example the format of the reflection file remained unchanged (Microsoft please note). These programs are used in well over 50% of small-molecule structure determinations. Although SHELX was originally intended only for small molecules, improvements in computing performance and data collection methods have led to increased use of SHELX for macromolecules, especially the location of heavy atoms from isomorphous and anomalous difference data using SHELXS, and the refinement of proteins against high-resolution data (2.5A or better) using SHELXL.
A further release of SHELX in the current millennium was never intended, but the increased (mis)use of the programs by macromolecular crystallographers, and changes in the CIF format for data deposition, have unfortunately (?) made it necessary to release a new version of the complete package as SHELX-97. This also provided the opportunity to update the structure solution algorithms and to add a new interface program for protein users, as well as fixing bugs and incorporating many suggestions by users. The new release should also be appreciably easier to install and use. A commercial version (SHELXTL) that incorporates all of SHELX-97 plus extensive interactive graphics is available from Bruker-AXS (nee Siemens).
SHELX-97 consists of the following programs:
SHELXS - Structure solution by Patterson and direct methods
SHELXL - Structure refinement (the version SHELXH is for large structures)
CIFTAB - Tables for publication via (small molecule) CIF format
SHELXA - Post-absorption corrections (for emergency use only)
SHELXPRO - Protein interface to SHELX
SHELXWAT - Automatic water divining for macromolecules
2. How to obtain SHELX
SHELX-97 is currently available by ftp transfer and on CDROM. It is free to academics (small donations are however not refused) and for a license fee of US$2499 to for-profit organizations. One license covers the use of all the programs for an unlimited time on an unlimited number of computers at one geographical location. This license income is essential for supporting the distribution and further development of the programs; we do not make a profit. A two-month free trial is available to for-profit users; at the end of that time they should either pay the license fee or send a signed declaration on company notepaper that they have destroyed all their copies of SHELX. Academic users who request the programs on CDROM are expected to contribute US$99; this may be waived for poorer countries without adequate ftp access. Applications should be made by post or by fax using the following application form only. CDROMs will be dispatched by normal post and may be subject to delay if there is a flood of applications; users intending to obtain the programs by ftp will receive the necessary instructions by email or fax. As part of the license agreement, users are expected to cite SHELX-97 in any publication in which it proved useful. It is understood that the author has no liability for any damage or loss caused by the programs; they may prove addictive!
Potential users of the Bruker SHELXTL version should contact Susan Byram (fax: +1(608)276-3006; email sbyram@Bruker-axs.com) at Bruker AXS in Madison, Wisconsin, USA or Eric Hovestreydt (fax: +49(721)595-4506) at Bruker AXS in Karlsruhe, Germany.
4. Changes between releases 97-1 and 97-2
1. A large number of minor bugs have been fixed. Several of them involved twinning and the application of restraints to complicated disorder problems.
2. Several errors (e.g. in the calculated data completeness) have been fixed in the CIF and PDB deposition procedures, and both brought up-to-date to January 1998 (as far as it is possible to hit moving targets!). SHELXPRO now calculates the Matthews coefficient and solvent content for PDB deposition.
3. In response to requests by three users, the maximum number of atoms that may be referenced on a single SHELXL instruction has been increased from 9998 to 63998.
4. The LIST 7 option has been added in SHELXL to output h,k,l,I,sigma(I) and the scaled calculated contributions to the intensities from individual twin components (-1 if absent).
5. Format files ciftab.rta and ciftab.rtm have been added to enable CIFTAB to generate Rich Text Format tables suitable for input into any version of MSWORD (and other text processors). I am grateful to Alex Yokochi for helping me with this.
6. The "I" and "U" options in SHELXPRO now understand ANISOU records, and "U" can read the atoms from an (XtalView) PDB file and the rest from the current .res file in order to generate a new .ins file.
7. Potential waters may be selected by SHELXWAT and the "U" option in SHELXPRO on the basis of their heights in the difference map in sigma units; a threshold in the range 4 to 5 is recommended. This avoids excessive dilution.
8. The first two ISOR parameters now have the same defaults (this maintains compatibility despite a bug-fix!).
9. The .res file is now output with more space for free-variable references and more decimal places; the latter avoids possible rounding errors when the cell dimensions are greater than about 200 Angstroms.
10. The new "J" option in SHELXPRO provides for the automatic generation of DFIX and DANG restraints from structures in CSD, PDB or SHELX format.
11. The new "Y" option in SHELXPRO converts various formats of reflection data file written by X-PLOR or CNS into a SHELX format .hkl file with retention of the free R flags.
12. The "M" option in SHELXPRO can now create maps for both "little-endian" and "big-endian" target machines running the program O.
13. The Linux executables were compiled using the Portland Group compiler which produces excellent code and can be strongly recommended (but costs real money). They are faster than the previous g77 versions and numerically more reliable than executables produced with the Absoft FORTRAN compiler.
14. The MSDOS executables were produced using version 4.0 of Lahey FORTRAN-90; they are faster than the previous versions and should run correctly in a DOS window under Windows 95 or NT. I am grateful to Alex Sobolev for suggesting this compiler.
15. Files index.doc and index.ps have been added to the doc and ps directories respectively, but otherwise the documentation hasn't been changed in this release.
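Item 2 above mentions that SHELXPRO now calculates the Matthews coefficient and solvent content. As a rough illustration of the arithmetic involved (not SHELXPRO's own code), the sketch below uses the usual textbook relations: V_M is the cell volume divided by the total protein mass in the cell, and the solvent fraction is approximately 1 - 1.23/V_M. The function names and the example numbers are hypothetical.

```python
def matthews_coefficient(cell_volume, n_molecules, molecular_weight):
    """Return V_M in cubic Angstroms per Dalton.

    cell_volume      -- unit-cell volume in cubic Angstroms
    n_molecules      -- number of protein molecules in the unit cell
    molecular_weight -- molecular weight of one molecule in Daltons
    """
    return cell_volume / (n_molecules * molecular_weight)


def solvent_fraction(v_m, protein_specific_volume=1.23):
    """Approximate solvent fraction of the crystal.

    1.23 cubic Angstroms per Dalton is the commonly assumed volume
    occupied by protein; it is an assumption, not a SHELXPRO constant.
    """
    return 1.0 - protein_specific_volume / v_m


if __name__ == "__main__":
    # Hypothetical cell: 100000 cubic Angstroms holding 4 molecules of 10 kDa.
    v_m = matthews_coefficient(100000.0, 4, 10000.0)
    print(v_m, solvent_fraction(v_m))
```

Values of V_M far outside the range usually seen for protein crystals are a hint that the assumed number of molecules in the cell is wrong.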
5. List of files on ftp site and CDROM
The ftp site and CDROM contain the following files and subdirectories:
'shelx.htm' and 'shelxman.htm' (NOT YET READY!) - On-line help in HTML format: requires a browser such as Netscape. shelx.htm contains the current SHELX homepage, shelxman.htm includes summaries of commands etc. For network use the file extensions will normally have to be changed to .html, but three letter extensions are used for the distribution for compatibility with MSDOS.
'applfrm.htm' - application form in HTML format, called by shelx.htm. Will probably also need renaming to .html.
Subdirectory 'unix' contains the sources of all programs for relatively standard UNIX systems. These should also compile successfully on many other operating systems (except VMS).
Subdirectory 'vms' contains the VMS sources for Digital computers.
Subdirectory 'doc' contains the full manual in WINWORD 6 format, one file per chapter. It is designed to print on letter sized paper.
Subdirectory 'ps' contains the full manual in Postscript format, one file per chapter. It is also designed to print on letter sized paper.
Subdirectory 'egs' contains the test jobs and other example files. These are in MSDOS text format, because all SHELX programs can read this format, even if they are running under UNIX, but the MSDOS versions cannot read UNIX text format.
Subdirectory 'ibm' contains the IBM RS6000 executables (these also execute on the IBM Power-PC series).
Subdirectory 'sgi' contains the SGI IRIX executables; they should run under IRIX 5.3 or later with the R4000 series processors. For other systems it is desirable to recompile to obtain programs that execute faster even if the precompiled versions run correctly. The executables shelxl64, shelxh64 and shelxs64 were compiled for 64-bit operation on an R10000 with IRIX 6.3 or later, but have not been extensively tested.
Subdirectory 'linux' contains the LINUX executables for Intel processors.
Subdirectory 'dos' contains the MSDOS executables; they should now run in a DOS window under Windows 95 or NT.
In addition, the ftp login directory contains gzip compressed tar files of the above subdirectories: doc.tgz [296266 bytes], dos.tgz [1879574 bytes], egs.tgz [203677 bytes], ibm.tgz [1064440 bytes], linux.tgz [1084059 bytes], ps.tgz [708846 bytes], sgi.tgz [2035985 bytes], unix.tgz [441512 bytes] and vms.tgz [441225 bytes]. These are convenient for down-loading with ftp as shown in the next section; the number of bytes should be checked for complete transfer.
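The byte-count check recommended above is easy to automate. The following is only a sketch: the sizes are copied from the list above, while the function name and the idea of scanning the working directory are assumptions made for illustration.

```python
import os

# Byte counts quoted above for each gzip compressed tar file.
EXPECTED_SIZES = {
    "doc.tgz": 296266,
    "dos.tgz": 1879574,
    "egs.tgz": 203677,
    "ibm.tgz": 1064440,
    "linux.tgz": 1084059,
    "ps.tgz": 708846,
    "sgi.tgz": 2035985,
    "unix.tgz": 441512,
    "vms.tgz": 441225,
}


def check_transfers(directory="."):
    """Return {filename: (actual_size, expected_size)} for every listed
    file that is present in `directory` but has the wrong length."""
    problems = {}
    for name, expected in EXPECTED_SIZES.items():
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            continue  # this file was not fetched, so there is nothing to check
        actual = os.path.getsize(path)
        if actual != expected:
            problems[name] = (actual, expected)
    return problems


if __name__ == "__main__":
    for name, (actual, expected) in sorted(check_transfers().items()):
        print(f"{name}: got {actual} bytes, expected {expected}")
```

A file that is too short usually means an interrupted transfer; a file that is slightly too long or too short by a scattering of bytes is the classic sign that ftp was left in ASCII rather than binary mode.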
6. How to Install SHELX-97
In many cases it will be possible to use the precompiled versions provided. The executable programs (and the file ciftab.def) should simply be copied from the appropriate directory on the CDROM or ftp site to a directory on your machine. This directory should be specified in the 'PATH' so that the executables can be found. On UNIX systems the lazy way is to copy the programs into /usr/bin; on MSDOS systems they are usually copied to C:\EXE and this directory name is then added to the PATH specified in AUTOEXEC.BAT. You may also wish to copy the documentation and example files. For example, for a PC running Linux the following files should be fetched to your working directory by ftp (binary transfer!); for most other UNIX systems the installation procedure is similar:
linux.tgz, ps.tgz, egs.tgz, shelx.htm and shelxman.htm (not yet available !)
The three compressed tar files can then be unzipped (gunzip turns each .tgz file into a .tar file) and extracted:
gunzip ps.tgz linux.tgz egs.tgz
tar -xvf ps.tar
tar -xvf linux.tar
tar -xvf egs.tar
which will create the subdirectories linux, ps and egs. The executables can be copied to /usr/local/bin or to /usr/bin (needs system manager privileges!):
cp linux/* /usr/local/bin
Under LINUX it is particularly easy to print the documentation, because lpr can recognize and print Postscript even on a non-Postscript printer:
The on-line help files shelx.htm and shelxman.htm should be copied to a generally accessible directory; they may be viewed with Netscape or any other HTML browser. shelxman.htm is called by shelx.htm; separate files are provided for ease of printing. These files are NOT copyrighted and may be shown freely to others for non-commercial purposes. The full documentation is available in WINWORD 6 format in subdirectory 'doc' and in Postscript form in subdirectory 'ps'.
Program compilation under UNIX (and other operating systems)
The UNIX version has been designed to be easy to compile on a wide range of UNIX (and other) systems. The resulting compiled programs do not need any environment variables or hidden files to run; it is simply necessary that the executable program is accessible via the PATH or an alias. The lazy way is to copy the executables into /usr/bin.
The Linux executable of SHELXL was compiled using version 1.7 of the Portland Group FORTRAN compiler as follows:
pgf77 -O2 -Mnoframe -Munroll shelxl.f shelxlv.f -o shelxl
The compilation for other UNIX systems should be similar (usually "pgf77" is replaced by "f77"). IT IS NECESSARY TO BE VERY CAREFUL ABOUT OPTIMIZATION. The safest approach is to compile without any optimization first (e.g. -O0 rather than -O3), run the ags4, sigi and 6rxn tests, and rename the resulting output files *.res, *.lst, *.fcf and *.pdb. Then recompile with highest optimization (e.g. -O3), rerun the tests, and use the UNIX diff instruction to compare the results with those from the unoptimized version. Small differences in the last decimal place do not matter, and of course the CPU times will differ, but if there are significant differences then the optimization level should be lowered and the tests repeated. For some systems (including certain SGI Challenge and Digital Alpha systems), only the shelxlv.f file (containing the rate-determining routines) can be compiled with the highest optimization level; shelxl.f must be compiled at a lower level.
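Plain diff reports every last-decimal-place change, which makes the comparison described above tedious to read. One possible alternative, sketched here under the assumption that the output files can be compared line by line and token by token, is to treat numeric tokens as equal when they agree within a small tolerance. The function names and the tolerance values are hypothetical and should be adjusted to the number of decimal places in the file being compared.

```python
import math


def _same_token(a, b, rel_tol, abs_tol):
    """Tokens match if identical, or if both parse as numbers that
    agree within the given tolerances."""
    if a == b:
        return True
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol, abs_tol=abs_tol)
    except ValueError:
        return False  # at least one token is not a plain number


def significant_differences(lines_a, lines_b, rel_tol=1e-3, abs_tol=1e-3):
    """Return the 1-based numbers of lines that differ by more than
    rounding noise. Lines beyond the end of the shorter input count
    as differences."""
    differing = []
    for number in range(1, max(len(lines_a), len(lines_b)) + 1):
        a = lines_a[number - 1].split() if number <= len(lines_a) else None
        b = lines_b[number - 1].split() if number <= len(lines_b) else None
        if a is None or b is None or len(a) != len(b):
            differing.append(number)
        elif not all(_same_token(x, y, rel_tol, abs_tol) for x, y in zip(a, b)):
            differing.append(number)
    return differing


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python compare.py ags4_O0.res ags4.res
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        print(significant_differences(fa.readlines(), fb.readlines()))
```

Lines reporting CPU times will still be flagged, as expected; anything else in the list deserves a closer look before the optimized executable is trusted.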
The shelxl.f source calls the following routines that may be different or not available for some FORTRAN compilers:
IARGC and GETARG: these have always worked so far; if necessary the standard C routines could be adapted since the specifications are the same.
EXIT and FLUSH: if these cause problems they can safely be commented out in the source or replaced by the dummy FORTRAN subroutines provided. EXIT is used in 2 places to tidy up before terminating, FLUSH(6) occurs once to flush the logfile so that a batch job can be watched as it runs (eg. with tail -f).
ETIME and FDATE: most UNIX FORTRAN compilers will recognize these routines. For compilers that do not, both FORTRAN and C substitutes are provided. Usually at least one substitute will work, but the following points should be checked carefully: Some FORTRAN compilers add an underscore to the end of procedure names before searching them in a library (this avoids confusion with standard C routines that happen to have the same names). The C versions are provided both with underscores (files fdate_.c and etime_.c) and without (fdate.c and etime.c). The FORTRAN substitute for FDATE (fdate.f) calls FORTRAN routines TIME and DATE. Some compilers link in the C procedure 'time' instead, with strange results because the parameters may be different. The alternative fdate.c is safer. The C replacement for ETIME (etime.c) may suffer from time 'wrap-around' if a large value for CLOCKS_PER_SEC (say 1000000) is combined with the use of a 32-bit or shorter integer to pass the time (!). Check the type time_t and CLOCKS_PER_SEC in /usr/lib/sys/time.h (you may need to consult a Guru).
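To see why the wrap-around mentioned above matters, it helps to work out how long a job can run before the tick counter overflows. The sketch below is simple arithmetic, not part of the SHELX sources; the function name is hypothetical and it assumes the elapsed time is held as a count of clock ticks in an integer of the stated width.

```python
def seconds_until_wrap(clocks_per_sec, bits=32, signed=True):
    """Longest CPU time in seconds that can be represented before a
    tick counter of the given integer width overflows."""
    largest = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return largest / clocks_per_sec


if __name__ == "__main__":
    # CLOCKS_PER_SEC = 1000000 with a signed 32-bit integer
    minutes = seconds_until_wrap(1_000_000) / 60
    print(f"counter wraps after about {minutes:.1f} minutes of CPU time")
```

With CLOCKS_PER_SEC at 1000000 and a signed 32-bit integer the counter wraps after roughly 36 minutes, which is well within the length of a large refinement job, so the reported CPU times for such jobs could be nonsense.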
The IBM RS6000 executable was compiled as follows; note that fdate.f
cannot be used for the reason given above, and that the underscore is not
needed after fdate in the C subroutine. The FLUSH routine was replaced
by the dummy.
xlf shelxl.f -O -c
xlf shelxlv.f -O -c
xlf etime.f -c
xlf flush.f -c
xlc fdate.c -c
xlf shelxl.o shelxlv.o fdate.o etime.o flush.o -o shelxl
SHELXS uses the same routines and should be compiled just like SHELXL; the same applies to SHELXH, the large version of SHELXL, which uses shelxh.f and shelxlv.f etc. The rate-determining routines for SHELXS are in shelxsv.f, the rest in shelxs.f.
One commented line near the start of SHELXL, SHELXH and SHELXS needs to be changed if these programs should write MSDOS format ASCII text files rather than UNIX format when run on a UNIX system. This is useful for a heterogeneous UNIX/MSDOS network, because the UNIX versions of all SHELX programs can read MSDOS format files, but not vice versa.
The remaining programs do not require optimization (except possibly SHELXA and SHELXPRO) and do not require FDATE, ETIME, FLUSH and EXIT, so they are easier to compile. For example under IRIX 5.3:
f77 shelxpro.f -o shelxpro
f77 shelxwat.f -o shelxwat
f77 ciftab.f -o ciftab
f77 shelxa.f -o shelxa
For optimum performance on the SGI R10000 all the programs should be recompiled with -O3 -64 -mips4 to produce 64-bit code (but check carefully for incorrect numerical results from optimizing errors!). The executables supplied should run under IRIX 5.3 through 6.3 inclusive provided that all the necessary SGI patches have been made to the operating system, but they will need recompiling for other versions of IRIX.
Unlike SHELXL and SHELXS, there are some intentional deviations from the strict FORTRAN-77 standard in these programs. REAL*8 and list-directed reading of internal files are used in several cases, and SHELXPRO uses types INTEGER*2 and BYTE in order to produce binary map files for O. Most FORTRAN compilers have no problems with these extensions, but may output warning messages.
Note that CIFTAB will search the current directory for a specified format file, and if it doesn't find it there it will look for it in a directory that is defined in the source. Unless this is edited before compiling, the directory is set to /usr/bin, so if the executable programs are located in /usr/bin the file ciftab.def (the default format file) should be there too.
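The search order just described can be mimicked in a few lines, which is handy for checking in advance which copy of a format file CIFTAB is likely to pick up. This is a sketch of the lookup as described in the text, not a transcription of the FORTRAN source; the function name is hypothetical and /usr/bin is used as the fallback only because it is the stated default.

```python
import os


def find_format_file(name, fallback_dir="/usr/bin", cwd="."):
    """Return the path a CIFTAB-style lookup would settle on, or None.

    The current directory is tried first, then the single fallback
    directory that would be compiled into the program.
    """
    for directory in (cwd, fallback_dir):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(find_format_file("ciftab.def"))
```

If the executables are installed somewhere other than /usr/bin, the simplest options are to edit the directory in the source before compiling or to keep a copy of the format file in each working directory.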
Program compilation under VMS
The following instructions may be tried for compilation of the VMS sources under OpenVMS:
It may be necessary to split up the programs into subroutines to prevent the compiler running out of virtual memory. The files produced by the test jobs for SHELXL and SHELXS MUST be compared with those obtained using unoptimized versions of these programs (compiled with /noopt instead of /opt; note that /opt is usually the default) since optimizing errors are common for Digital compilers; there is a DIFF instruction in VMS that can be used for this. The remaining programs are not very CPU-intensive and so should not be optimized. If optimization causes errors, it is worth trying just to optimize shelxsv.f and shelxlv.f (which contain the rate determining routines) but not the rest. The executables need to be defined as follows:
shelxs :== $ disk:[directory]shelxs etc.
where 'disk' and 'directory' should be replaced by the appropriate local names and the programs are run (after preparing the files name.ins and name.hkl) by e.g.
SHELXWAT and SHELXA accept UNIX-type switches (even under VMS); they MUST come before the filename, e.g.
shelxwat -h -w4.5 name
No other files or parameter settings are required to run the programs, except that the files ciftab.def, ciftab.rta, ciftab.rtm or a user-produced format definition file should be in the current directory when CIFTAB is run; if this file cannot be found in the current directory, CIFTAB searches for it in a directory specified in the source.
Parallel and vector machines
SHELXL and SHELXS are designed to run very efficiently on vector computers (such as older Cray and Convex machines); no changes should be needed to the code. Unfortunately the crystallographic algorithms involved are less suitable for parallel computers (or multiprocessor systems); in such cases the available computer resources are more efficiently used by running several jobs simultaneously, one per processor.
SHELXH - version of SHELXL for very large structures
SHELXH is a special version of SHELXL for the refinement of very large structures (with more than about 10000 unique atoms). The only difference between shelxh.f and shelxl.f is the first FORTRAN statement in which the array dimensions are specified by means of a PARAMETER statement; shelxh was compiled (using shelxlv.f etc.) exactly as described above for shelxl. Large versions of shelxs, shelxpro and shelxa may be created in the same way, but it is rather unlikely that they will ever be required. Further details are provided by comments in the respective sources.
SHELXL will print a suitable error message if it is necessary to increase the dimensions of the large arrays A or B. An additional warning sign is the 'maximum vector length' printed in the .lst file at the beginning of each refinement cycle; if it is too small (say less than 32) the program will still run, but with reduced efficiency. This applies to all computers but is especially serious on a vectorizing computer such as an older Cray or Convex.
A little care and fine-tuning is required so that such large structures can be refined efficiently. If the computer does not have enough physical memory available, or if the 'maximum vector length' is set too large, shelxh will run in disk exercising mode. This 'maximum vector length' refers to the number of reflections that are processed in one vector run, which may be smaller than the number in the input/output buffer. Some trial and error is needed to set the maximum allowed value so that the physical memory is fully exploited with a minimum of disk I/O for the virtual memory swap file. This number is set as the fourth parameter on the L.S. or CGLS instruction, and should be a multiple of 8; a good value to try for a 64MB computer is 64 (the third number on the L.S. or CGLS instruction is almost always zero). The array B is used as working space for these vectors (CGLS and L.S.) as well as for the least-squares matrix (L.S.). If the array B is not big enough, the program will use a smaller maximum vector run.
7. Frequently asked questions
Q1: Please send me a copy of SHELX-76. I am afraid that I cannot use the new version because my diffractometer measures F-values, not intensities.
A: Buy a CCD detector. They measure intensities! In fact, diffractometers
measure intensities too. You just need the right data reduction program.
If you are desperate you can even feed SHELXL with F-values using HKLF
Q2: When I start SHELXL on my PC the disk rattles loudly for several hours and smoke comes out of the back. Is this a bug?
A: You must be trying to run SHELX under some version of WINDOWS! The
best solution is to reformat the hard disk and install LINUX. However the
release 97-2 should produce less smoke than 97-1.
Q3: The referee rejected my paper because the weighted R-factor was too high and because the stupid program had forgotten to fix the y coordinate of one atom to fix the origin in space group P21. What should I do?
A: Try another journal; if you emphasize the 'biological relevance'
enough, they may not notice the R-factor! Note that wR2 (based on intensities
and all data) is of necessity 2 to 3 times higher than wR1 (based on F
and leaving out reflections with say F<4sigma). Unfortunately SHELXL
cannot work out wR1, because the weighting scheme for intensities does
not apply to F-values. It is better to quote the unweighted R1 (with or
without a 4sigma threshold) anyway, because it is too easy to cheat on
wR2 by modifying the weights! It is no longer necessary or desirable to
fix the origin by fixing coordinates; the program applies appropriate floating
origin restraints automatically when they are needed.
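For readers unsure how the two residuals in this answer differ, the sketch below computes both for a toy data set using the standard definitions: R1 = sum||Fo|-|Fc|| / sum|Fo| and wR2 = sqrt( sum w(Fo^2-Fc^2)^2 / sum w(Fo^2)^2 ). It is an illustration only. Unit weights are assumed here (SHELXL uses its own weighting scheme), the function names are hypothetical, and the intensities are obtained by squaring F, whereas real measured intensities may be negative.

```python
import math


def r1(f_obs, f_calc):
    """Unweighted residual based on structure factor magnitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)


def wr2(f_obs, f_calc, weights=None):
    """Weighted residual based on intensities (F squared).

    Unit weights are used when none are supplied; this is an
    assumption for illustration, not the SHELXL weighting scheme.
    """
    if weights is None:
        weights = [1.0] * len(f_obs)
    numerator = sum(w * (o * o - c * c) ** 2
                    for w, o, c in zip(weights, f_obs, f_calc))
    denominator = sum(w * (o * o) ** 2 for w, o in zip(weights, f_obs))
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    f_obs = [10.0, 20.0]
    f_calc = [9.0, 21.0]
    print(r1(f_obs, f_calc), wr2(f_obs, f_calc))
```

Even for this tiny example the intensity-based residual comes out noticeably larger than the F-based one, because squaring amplifies the relative discrepancies.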
Q4: The program tells me to refine extinction, this does reduce the R-factor but the extinction parameter becomes very large although my crystal could hardly be described as 'perfect'. Is this reasonable?
A: No. The most likely causes of large apparent extinction are: (a)
you have input F with HKLF 4, (b) a few reflections that should be very
strong have been measured as weak because they were cut off by the beam-stop,
(c) your counter was saturating and an inadequate dead-time correction
was made (in the case of an image plate this is an 'overload'), or (d)
your counter was defective or the energy discrimination was set wrongly.
Overloads may be eliminated by 'OMIT h k l' if necessary.
Q5: The structure could only be solved in P1, not P-1, but on refinement some of the bond lengths and U-values are wildly different in the two molecules. If I use SAME the geometries of the two molecules become very similar but how do I restrain the Uij components of equivalent atoms to be the same?
A: You could use EADP, but it might be better to look for the inversion center instead, otherwise you will probably be 'marshed'.
Q6: I included batch numbers in the .hkl file and BASF parameters in the .ins file, but the stupid program still didn't refine the batch scale factors!?
A: You need MERG 0 (the default MERG 2 will average the batch numbers).
Q7: How do I obtain the molecular replacement program PATSEE?
A: PATSEE has been maintained by its author, Ernst Egert, since he moved from Goettingen to the University of Frankfurt. He can be contacted by fax (+49-69-7982-9128) or email (firstname.lastname@example.org).
Q8: What should I do about 'may be split' warnings?
A: Probably nothing. The program prints out this warning whenever it
might be possible to interpret the anisotropic displacement of an atom
in terms of two discrete sites. Such atoms should be checked (e.g. with
the help of an ORTEP plot) but in many cases the single-site anisotropic
description is still eminently suitable.
Q9: I get the message ' ** UNSET FREE VARIABLE FOR ATOM ... **' but I haven't used any 'free variables'!?
A: There is a typo in your atom coordinates, e.g. a decimal point missing or replaced by a comma. Alternatively you may have really referenced a free variable that wasn't defined by FVAR!
Q10: After using SHELXPRO to prepare the .ins file from a PDB file and then running SHELXL, I get the message: ' ** No match for 2 atoms in DFIX ** ' !?
A: This message probably refers to the fact that SHELXPRO labels the
oxygens of the carboxy-terminus OT1 and OT2 so that special restraints
can be applied, so there is no atom called 'O' in this residue. This is
normal and can be safely ignored. Other similar messages, also messages
about bad CHIV or AFIX connectivity, should be investigated (by checking
the extra information, including the connectivity table, given in the .lst
file) to see if they can be ignored safely or not. If the initial geometry
is poor, it may be necessary to edit the automatically generated connectivity
table with BIND and FREE.
Q11: The program prints out a Flack x parameter of 0.3 with an esd of 0.05. Is the crystal racemically twinned?
A: Not necessarily! The Flack parameter estimated by the program in the final structure factor calculation ignores correlations with all other parameters (except the overall scale factor). Since these parameters may have refined so as best to fit a wrong absolute structure, it is quite possible to get an estimate of about 0.3 for the Flack parameter when the true value is 1, i.e. the structure needs to be inverted and is not racemically twinned. On the other hand a value close to zero with a small esd is a strong indication that the absolute structure is correct. If there is any doubt the Flack parameter should be refined together with all the other parameters using TWIN and BASF.
Q12: Neither direct methods nor Patterson interpretation in SHELXS can find the 24 selenium atoms from MAD data of my selenomethionine labeled protein.
A: I'm not surprised.
Q13: How do I move a refinement from X-PLOR to SHELXL?
A: Use the "Y" option in SHELXPRO to convert the .fob reflection
data to SHELX .hkl format, keeping the free R flags. Then use the "I"
option to convert the PDB file to SHELX .ins format. That's all (but see
Q14: How does one set up restraints for a non-standard residue for SHELXL?
A: First find a suitable fragment in a database such as the CSD, then
use the "J" option in SHELXPRO. FLAT and (zero chiral volume)
CHIV restraints can easily be added by hand. If the structure contains
a number of identical units such as sulfate ions, SADI or SAME can be used
instead, then it is not necessary to invent any target values.
Q15: How do I know where to look for 'interesting' features in my maps after refining a protein with SHELXL?
A: SHELXL and SHELXPRO provide a great deal of diagnostic information. There is a big table near the end of the .lst file summarizing features of the difference map, mean displacement parameters, restraint violations etc. on a residue by residue basis. Also check the .lst file for any restraint violations. The "T" and "R" options in SHELXPRO should be run after every job; if there is NCS (restrained or not) the "N" and "K" options are very useful. Remember that these options all write Postscript plots to the .ps file, but you cannot look at that until you quit SHELXPRO. A combination of 2mFo-DFc (Sigma-A) and Fo-Fc maps (contoured at positive and negative levels) is usually most useful in O or XtalView. Every few jobs you should also run PROCHECK and/or WHAT-CHECK on the .pdb file created by SHELXL (for WHAT-CHECK you will need to edit the name of the space group into the .pdb file).
Q16: What is the worst resolution that is acceptable for: (a) solution of a structure by direct methods using SHELXS, (b) refinement with SHELXL?
A: Direct methods assume randomly distributed resolved atoms. Direct
methods are crucially dependent on having atomic resolution data, say better
than 1.2A. A good rule of thumb is that at least one half of the theoretically
possible number of reflections between 1.1 and 1.2A should have been measured
with I>2sigma for direct methods to be successful, though this rule
can be relaxed somewhat for centrosymmetric structures and structures containing
heavier atoms. In particular the resolution is not so critical for the
location of heavy atoms from delta-F data, provided that the minimum distance
between heavy atoms is much greater than the resolution. SHELXL lacks the
energy terms used by e.g. X-PLOR for refinement against low-resolution
data. This imposes an effective limit of about 2.5A for SHELXL refinement,
but this limit may be extended a little to lower resolution if NCS restraints
can be used.
Q17: I have a lot more questions to ask ...
A: Look at Thomas Schneider's FAQ's first!
8. Important changes since SHELX-93 and SHELXS-86
1. The new programs SHELXPRO (interface for protein users), SHELXWAT (automated water divining for macromolecules) and SHELXA (post-absorption corrections) have been added. The full documentation should be consulted for details. In particular, SHELXPRO includes extensive facilities for data input, analysing refinement results, PDB file manipulation and deposition, and preparation of maps for graphics programs such as O. SHELXPRO replaces the program PDBINS supplied with SHELXL-93, and the dictionary file SHELXL.DIC that PDBINS used is no longer required (an improved dictionary is stored internally in SHELXPRO).
The program SHELXA has been kindly donated to the system by an anonymous user. It applies "absorption corrections" by fitting the observed to the calculated intensities as in the program DIFABS. SHELXA is intended for EMERGENCY USE ONLY, eg. when the world's only crystal falls off the diffractometer before there is time to make proper absorption corrections by indexing crystal faces or by determining an absorption surface experimentally by measuring equivalent reflections at different azimuthal angles etc. Under no circumstances should the results be published; the anonymous donor does not wish to be cited in this non-existent publication because it might ruin his reputation!
A simple program SHELXWAT has been added that iteratively recycles SHELXL to provide automatic water divining. This may be regarded as a cheap and inadequate imitation of the ARP method [V.Lamzin & K.S.Wilson, Acta Cryst. D49 (1993) 129-147], but is relatively easy to use and useful if you intend to take a holiday.
A large version of SHELXL called SHELXH has been added for the refinement of large structures (on fast computers with a lot of RAM).
2. The new version of SHELXS includes the 'phase annealing' direct methods described in Acta Cryst. A46 (1990) 467-473, and the Patterson interpretation method used in Acta Cryst. D49 (1993) 18-23. The latter represents my best method so far for finding heavy atoms from macromolecular SIR and OAS data, although it was originally written for small molecules. Examples are provided for small molecules (log, cumos2) and protein SIR data (barnase). The location of more than ca. 10 selenium atoms from noisy MAD data remains a difficult problem that I am still working on!
3. CIFTAB and the preparation of small-molecule CIF files have been revised so as to be fully compatible with the 1998 Instructions for Authors for Acta Crystallographica; a number of new items have been added to the output file from SHELXL and to the template format file CIFTAB.DEF. SHELXL now rounds esds written to the .cif file according to the 'rule of 19'. The recommended way of depositing macromolecular data is still PDB format (via SHELXPRO that sets up a SHELXL template) for the coordinates and refinement details, plus the CIF format .fcf output file written by SHELXL for the reflection data (approved by the PDB). Note that the specification of the ACTA instruction has changed!
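The 'rule of 19' mentioned above can be illustrated with a short Python sketch. It assumes the usual reading of the rule: the esd, expressed in units of the last quoted digit, should lie between 2 and 19. The function name and the max_places parameter are hypothetical and are not part of SHELXL or CIFTAB; this is not the code SHELXL itself uses.

```python
def format_with_esd(value: float, esd: float, max_places: int = 8) -> str:
    """Format value(esd) so that the esd in last-digit units lies in 2..19.

    Hypothetical helper for illustration; not taken from SHELXL.
    """
    if esd <= 0:
        raise ValueError("esd must be positive")
    # Start at the finest precision and coarsen until the esd fits in 2..19.
    for places in range(max_places, -1, -1):
        units = round(esd * 10 ** places)
        if 2 <= units <= 19:
            return f"{value:.{places}f}({units})"
    # Esds of about 20 or more: quote value and esd as whole numbers.
    return f"{round(value)}({round(esd)})"


print(format_with_esd(1.23456, 0.0019))
print(format_with_esd(1.23456, 0.002))
```

With these inputs an esd of 0.0019 is quoted with two digits as (19), whereas 0.002 is quoted with one digit as (2), since (20) would fall outside the allowed range.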
4. After intensive debate by the IUCr COMCIFS committee, the action of the OMIT and SHEL instructions in SHELXL has been altered. A 2-theta limit or OMIT h k l instruction now causes the reflections in question to be rejected entirely, i.e. treated in the same way as systematic absences. These data do not contribute to any of the numbers printed out by the program except the number of reflections read from the .hkl file (however serious systematic absence violations are always reported before the reflection is rejected, because these may indicate that the space group is wrong).
A positive first OMIT parameter has the same meaning as before; the reflection is retained but not used for refinement, and this option is still inconsistent with the ACTA instruction. Such reflections are included in the R-values etc. for 'all data' but not in the 'F>4sigma(F)' R-values. Both they and the R(free) reflections are marked with an asterisk in the table of the 'most disagreeable reflections'.
A negative or zero first OMIT parameter s does not cause any reflections to be suppressed, but intensities more negative than 0.5s are reset to 0.5s. The default value of s=-2 is believed to be sufficiently negative to avoid statistical bias, whilst still reducing the influence of erroneous outliers.
Side-effects of these changes are that refinements on restricted data shells (e.g. rigid group refinements with fixed displacement parameters [BLOC 1] in the initial stages of macromolecular refinement) will be faster, and that R-factors and goodness-of-fit will sometimes be slightly lower (experience indicates that this does not usually lead to complaints).
5. 'Government health warnings' have been added to the recommended weights, the Flack parameter and the bond lengths to hydrogen when these are output to the .lst file by SHELXL.
6. If a SIZE instruction is included in the .ins file, SHELXL uses it and the calculated absorption coefficient to estimate the minimum and maximum transmission coefficients, which are written to the .CIF output file. These estimates are probably closer to the true absorption by the crystal than the values obtained from empirical absorption correction programs based on azimuthal scans or equivalent reflections, since such programs also (very conveniently) correct for other systematic errors as well (e.g. absorption by a glass fibre or capillary) and only determine relative transmission.
7. The recommended method for specifying which reflections should be used for R-free is to flag them with a negative 'batch-number' in the .hkl file. This may be set to be consistent with the selection of reflections for preliminary refinement with other programs such as X-PLOR, or may be set at random or in thin shells by SHELXPRO. If the second number on an L.S. or CGLS instruction is -1, reflections read in to SHELXL with HKLF 3 or HKLF 4 are reserved for R(free) if their batch numbers are negative. These reflections are treated normally (i.e. the sign of the batch number is ignored) if this parameter is set to any other value. Note that -1 is used because it would not be sensible to select all the data for the R(free) test if the selection is performed by SHELXL (-n would select every nth reflection - this has been retained for upwards compatibility with SHELXL-93 but is not recommended for new structures). Any MERG value may be used with the "-1" option, but a merged reflection is only used for R(free) if ALL the contributors had negative batch numbers.
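The batch-number rule for merged reflections can be sketched in Python as follows. This is only an illustration of the logic described in the text; the function name and argument names are hypothetical, and the -n (every nth reflection) selection is deliberately not modelled.

```python
def reserved_for_rfree(batch_numbers, ls_second_param):
    """Decide whether a (possibly merged) reflection is reserved for R(free).

    batch_numbers: batch numbers of all contributors to the merged reflection.
    ls_second_param: the second number on the L.S. or CGLS instruction.
    Hypothetical helper; only the batch-flag selection (-1) is covered.
    """
    if ls_second_param != -1:
        # The sign of the batch number is ignored for any other value.
        return False
    # A merged reflection counts only if ALL contributors were flagged.
    return bool(batch_numbers) and all(b < 0 for b in batch_numbers)


print(reserved_for_rfree([-1, -1, -1], -1))  # all contributors flagged
print(reserved_for_rfree([-1, 2, -1], -1))   # one contributor not flagged
```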
8. An instruction HTAB has been added to SHELXL. It prints an analysis of the hydrogen bonding geometry for all polar hydrogens, and reports bad H...H contacts (e.g. where the program has assigned two hydrogens to the same hydrogen bond when automatically generating them). The algorithm for the generation of polar hydrogens has been improved to avoid such clashes.
Note that HTAB only finds the 'right' hydrogen bonds if the hydrogens are approximately correctly placed! The program suggests alternative hydrogen positions if two hydrogens have been assigned to the same hydrogen bond, or if an O-H hydrogen makes no hydrogen bonds. HTAB followed by the names of the donor and acceptor atoms (the latter may involve a symmetry operation) generates a table with esds (except for riding hydrogens) and the new CIF output for hydrogen-bonds.
9. A BLOC instruction with no parameters now fixes all atomic parameters (xyz, sof and Uij) in a SHELXL refinement. Such a BLOC instruction takes priority over all other BLOC instructions, irrespective of the order in which they are given.
10. The new instruction STIR ('stepwise improvement of resolution') has been added to SHELXL. It takes two parameters: the starting resolution and the increment in resolution. Thus: STIR 4 0.01 (or STIR 4 since 0.01 is the default for the second parameter) causes a limiting resolution of 4.00A to be used on the first refinement cycle, 3.99 on the second, 3.98 on the third and so on. The result is a gradual increase in the number of reflections included in the refinement until the limit of the data or SHEL instruction is reached. The low resolution limit is specified by the SHEL instruction. By starting at lower resolution and gradually improving it in this way the radius of convergence for models with positional errors is increased. This may be regarded as a primitive form of 'simulated annealing' for use in the early stages of macromolecular refinement, e.g. for molecular replacement solutions.
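The resolution schedule produced by STIR can be reproduced with a few lines of Python. This is a sketch of the arithmetic only; the function name, the high_res_limit argument (standing in for the limit of the data or the SHEL instruction) and the cycles argument are assumptions made for the illustration.

```python
def stir_limits(start, step=0.01, high_res_limit=0.0, cycles=10):
    """Return the limiting resolution (in Angstroms) used on each cycle.

    start and step correspond to the two STIR parameters; high_res_limit
    stands in for the limit of the data or of the SHEL instruction.
    """
    limits = []
    for cycle in range(cycles):
        # The limit improves by one step per cycle until the data limit is hit.
        d = max(start - cycle * step, high_res_limit)
        limits.append(round(d, 4))
    return limits


print(stir_limits(4.0, cycles=3))  # STIR 4 with the default increment
```

For STIR 4 the first three cycles use 4.00, 3.99 and 3.98A, as described above.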
11. The extra FMAP options FMAP 5 (Sim weighted difference electron density) and FMAP 6 (Sim weighted difference electron density with sharpening) have been added to SHELXL. Other FMAP parameters remain unchanged. At least in theory, these may make it easier to locate missing atoms when the structure is being phased by a relatively small percentage of the total scattering power, e.g. by one or two heavier atoms. FMAP 5 and 6 assume that the unit-cell contents given on the SFAC and UNIT instructions are approximately correct.
12. SHELXL now recognizes 'D' as an element name on an SFAC instruction. Except for calculating the density, this is treated as 'H'. The correct cell contents including deuterium should be specified on the UNIT instruction, but all H and D atoms must refer to the scattering factor number of H, even if they are all deuterium. This enables the density to be calculated correctly (and written to the CIF file) for samples that have crystallized out of deuterochloroform or C6D6 in NMR tubes !
13. The CGLS algorithm in SHELXL has been improved; the program learns
the optimum shift factor to apply from the behavior in previous refinement
cycles. SLIM is thus no longer required, but the first DAMP parameter may
be used if necessary to set the shift factor for the first cycle (since
the program has not then had time to learn it). It follows that DAMP now
has a different meaning (and quite different parameter values) for L.S.
and CGLS. In general, CGLS should now always converge optimally without
the need for any hand tuning.
14. The LIST 6 option has been added to SHELXL for producing both small
and macromolecule CIF format reflection data files for deposition; this
format has been approved by the PDB. Several options in SHELXPRO also require
a .fcf file created with LIST 6 in SHELXL. The unit-cell and symmetry operators
have been added to all the CIF format .fcf output files (making them incompatible
with the old versions of CIFTAB and XCIF !).
15. The R-factors etc. at the end of a SHELXL refinement are written
as comments (REM) to the .res file after the HKLF instruction. If it is
desired to accumulate them to provide a summary of the refinement, they
should be moved in front of the HKLF instruction each time the .res file
is edited to prepare the .ins file for the next refinement job. The 'U'
option in SHELXPRO does this automatically.
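For users who prefer to script this step rather than use SHELXPRO, the move can be sketched in Python. The sketch assumes that the summary REM lines sit between the HKLF instruction and an END line (or the end of the file); that layout is an assumption made here, and the function name is hypothetical.

```python
def move_summary_before_hklf(lines):
    """Move REM lines found after HKLF to just in front of it.

    lines: the lines of a .res file without trailing newlines.
    Assumes the REM summary lies between HKLF and END (or end of file).
    """
    hklf = next((i for i, text in enumerate(lines)
                 if text.upper().startswith("HKLF")), None)
    if hklf is None:
        return list(lines)
    end = next((i for i in range(hklf + 1, len(lines))
                if lines[i].upper().startswith("END")), len(lines))
    tail = lines[hklf + 1:end]
    rems = [text for text in tail if text.upper().startswith("REM")]
    rest = [text for text in tail if not text.upper().startswith("REM")]
    # Everything before HKLF, then the summary, then HKLF and the remainder.
    return lines[:hklf] + rems + [lines[hklf]] + rest + lines[end:]


res = ["TITL test", "L.S. 4", "HKLF 4", "REM R1 = 0.05", "END"]
print(move_summary_before_hklf(res))
```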
16. The anisotropic scaling proposed by Hope and Parkin has been incorporated
into SHELXL and SHELXPRO. The latter is intended as a quick check to see
whether it is worth including the extra 12 parameters into the SHELXL refinement.
These are stored as BASF parameters, so the HOPE instruction should be
followed by n (where n is the number of the first BASF parameter used for
this purpose); if n is negative the parameters are held fixed. This enables
the BASF parameters up to n-1 inclusive to be used for other purposes such
as the refinement of twinned crystals. The default value of n is 1, and
if the instruction HOPE is given in SHELXL without any BASF parameters,
the program will invent suitable starting values (it is much better to
leave this to the program). Anisotropic scaling is likely to be effective
for structures that are refined isotropically; the parameters are very
highly correlated with the individual anisotropic displacement parameters
and so are not useful for full anisotropic refinements.
17. The modeling of diffuse solvent has been changed so that it uses
the formula proposed by P.C.Moews and R.H.Kretsinger, J.Mol.Biol. 91 (1975)
201-228. The same formula is used by the program TNT, but the implementation
is different; in SHELXL the two extra parameters are refined in every cycle,
in TNT they are evaluated between cycles (which requires the use of two
further dummy parameters to allow for correlations). If the SWAT instruction
is input with no parameters, the program will invent suitable starting
values. Note that the old SWAT parameters might be unsuitable for the new
formula and so should be deleted if present in a file from SHELXL-93.
18. Anti-bumping restraints are now generated automatically by SHELXL
only for short contacts in which both atoms involved are of the (SFAC)
type C, N, O or S. BUMP now includes all (non-H) interactions that are
separated by more than three bonds in the connectivity array (after this
has been edited if necessary with FREE and BIND). If hydrogen atoms are
present, all short H..H distances in which the hydrogen atoms are separated
by more than two bonds also create anti-bumping restraints. PART numbers
and occupancies are taken into account. BUMP is followed by a single parameter,
the esd. The default esd is the first DEFS parameter. If the esd is negative,
the connectivity is also checked for symmetry equivalent atoms, otherwise
not. Thus for a positive esd, antibumping restraints are always applied
to short interactions with atoms not in the original asymmetric unit, even
if spurious bonds have been added automatically to the connectivity table.
This handles the common case of side-chains wandering too close to 2-,
3-, 4- or 6-fold axes in proteins (if there really is a crystallographic
2-fold axis through a disulfide bond, make the esd negative !). The antibumping
restraints are now recalculated each refinement cycle. Note that antibumping
restraints can still be added by hand (DFIX -d, if necessary combined with
If a BUMP instruction is present, the program prints a list of all 1,2-
and 1,3-distances (as defined by the connectivity table) that are NOT subject
to distance restraints. This will often reveal errors or omissions in the
restraints, though some omissions (e.g. involving heavier atoms) may be
intentional. It revealed two omissions (which no user had noticed) in the
standard restraints dictionary SHELXL.DIC supplied with SHELXL-93.
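The connectivity test used by BUMP ('separated by more than three bonds') can be sketched as a breadth-first search over the bond list. This Python example covers only the connectivity filter; the distance test, PART numbers, occupancies and symmetry equivalents are not modelled, and all function and argument names are hypothetical.

```python
from collections import deque
from itertools import combinations


def atoms_within(neighbours, start, max_bonds):
    """Return the atoms reachable from start in at most max_bonds bonds."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if seen[atom] == max_bonds:
            continue  # do not walk further than max_bonds from start
        for nxt in neighbours[atom]:
            if nxt not in seen:
                seen[nxt] = seen[atom] + 1
                queue.append(nxt)
    return set(seen)


def bump_candidates(atoms, bonds, max_bonds=3):
    """List atom pairs separated by more than max_bonds bonds."""
    neighbours = {a: set() for a in atoms}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return [(a, b) for a, b in combinations(sorted(atoms), 2)
            if b not in atoms_within(neighbours, a, max_bonds)]


chain = ["C1", "C2", "C3", "C4", "C5", "C6"]
bonds = list(zip(chain, chain[1:]))
print(bump_candidates(chain, bonds))
```

Setting max_bonds=2 gives the corresponding filter for H..H contacts.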
19. The CHIV (and corresponding RTAB) instructions for SHELXL now use
the three bonded atoms in ASCII (alphabetical) order, so that the sign
of the chiral volume is independent of the order of atoms in the .ins file.
Suitable CA chiral volume restraints (and CB for Thr, Ile, Val and CG for
Leu - the latter two are not actually chiral but the restraints impose
the conventional labeling scheme!) are now generated by SHELXPRO when it
creates an .ins file. 'Planar' and 'chiral' CHIV restraints are now summarized
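The effect of sorting the three bonded atoms in ASCII order can be seen in a small Python sketch. It assumes that the chiral volume is taken as the signed triple product of the three bond vectors in Cartesian coordinates; the text above states only the ordering rule, so that definition and the function name are assumptions made for the illustration.

```python
def chiral_volume(centre, neighbours):
    """Signed volume from the three bonds, with atoms taken in ASCII order.

    centre: (x, y, z) of the central atom, Cartesian coordinates.
    neighbours: dict mapping atom name to (x, y, z) for the three bonded atoms.
    """
    if len(neighbours) != 3:
        raise ValueError("exactly three bonded atoms are required")
    # sorted() orders plain ASCII names by character code, e.g. C < CB < N.
    names = sorted(neighbours)
    a, b, c = (tuple(n - o for n, o in zip(neighbours[name], centre))
               for name in names)
    # Scalar triple product a . (b x c)
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


first = {"N": (1, 0, 0), "C": (0, 1, 0), "CB": (0, 0, 1)}
second = {"CB": (0, 0, 1), "C": (0, 1, 0), "N": (1, 0, 0)}  # same atoms, other order
print(chiral_volume((0, 0, 0), first) == chiral_volume((0, 0, 0), second))
```

Because the names are sorted before the product is formed, listing the same atoms in a different order does not change the sign.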
20. An instruction DANG has been added to SHELXL. It is intended for
1,3- distance (angle distance) restraints. It is exactly the same as DFIX
except that the default esd is twice the first DEFS parameter (default
0.04) instead of one times it (0.02), and the mean deviation is reported
separately in the output.
21. The diagnostic features have been substantially strengthened in
SHELXL. Bad restraints, hydrogen geometry etc. are reported but no longer
lead to termination. HFIX, AFIX and CHIV now have a deeper understanding
of PART numbers. FLAT restraints are still applied even if atoms are missing.
Atoms with unsuitable connectivity for CHIV or for the generation of hydrogen
atoms are ignored (with a comment) without aborting the job. New tables
give a variety of criteria to identify suspect residues in large structures,
such as: (a) minimum and mean difference density at atom centers, (b) maximum
difference peak within 2A of a residue (a peak is always assigned to one
residue, the nearest), (c) mean and maximum U or Ueq, (d) mean and maximum
anisotropy (the smallest divided by the largest principal U component),
and (e) the worst SIMU and anti-bumping restraint deviations. This information
is provided separately for main-chains, side-chains and solvent.
22. The SHELXL instruction MORE has been reorganized. MORE 2 is now
needed to get the full tables of restraints, but the summary of restraint
residuals appears after the last cycle even with MORE 0. A negative value
of MORE dumps the parameter list and full covariance matrix from the last
refinement cycle to a .mat file; this option cannot be used at the same
time as WPDB.
23. SUMP may now be applied to BASF, EXTI and SWAT parameters in SHELXL
(the numerical order is osf, free variables, BASF and then EXTI or SWAT).
This is useful in the initial stages of twinned crystal refinement.
24. SHELXL now estimates an initial scale factor only when the starting
value is 1.00000 (or FVAR is absent). This saves time and (in rare cases)
avoids refinement instability. If major changes are made to the model (e.g.
a diffuse solvent model is added or removed) it is advisable to reset the
overall scale factor (the first number on the first FVAR record) to 1.00000
to force the recalculation of an initial overall scale factor.
25. The NCSY instruction applies non-crystallographic symmetry restraints.
NCSY is followed by DN (no default), sigma14 (default 5 times the first
DEFS parameter), sigmaU (default equal to the fourth DEFS parameter) and
a list of atoms. For each atom SHELXL attempts to find an 'equivalent'
atom with the same name and a residue number DN greater than the residue
number of the named atom. If sigma14 is greater than zero, the connectivity
array is used to generate 1,4-distances for which both atoms are specified
in the same NCSY instruction; SADI restraints are then created to make
them equal to the corresponding 1,4-distances involving 'equivalent' atoms.
If sigmaU is greater than zero and both atoms are isotropic, SIMU restraints
are set up to make the U values approximately equal for a named atom and
its 'equivalent'. Usually NCS restraints are employed for isotropic (protein)
refinement at relatively low resolution; these restraints are more flexible
than NCS constraints because they impose NCS locally rather than globally,
and it is not necessary to specify a transformation matrix. In the case
of linear side-chains (e.g. lysine), a change in the sign of a gauche torsion
angle would not violate these restraints; chemically this is not unreasonable.
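The way NCSY pairs each named atom with its 'equivalent' can be sketched in Python. The example covers only the lookup by atom name and residue number offset DN; the generation of the SADI and SIMU restraints is not modelled, and the function name and data layout are hypothetical.

```python
def ncs_equivalents(all_atoms, named_atoms, dn):
    """Pair each named atom with the same-named atom DN residues later.

    all_atoms: iterable of (atom_name, residue_number) for the whole model.
    named_atoms: the atoms listed on the NCSY instruction, same layout.
    dn: the residue number offset DN (no default, as in the instruction).
    """
    present = set(all_atoms)
    pairs = []
    for name, resnum in named_atoms:
        partner = (name, resnum + dn)
        if partner in present:
            pairs.append(((name, resnum), partner))
        # Atoms without an equivalent are simply skipped in this sketch.
    return pairs


model = [("CA", 11), ("CB", 11), ("CA", 1011), ("CB", 1011), ("CA", 12)]
print(ncs_equivalents(model, [("CA", 11), ("CB", 11), ("CA", 12)], 1000))
```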
26. A new and very simple procedure is provided for defining disordered
residues in macromolecules for SHELXL. It enables the restraints etc. for
the corresponding undisordered residues to be used unchanged, and is also
compatible with standard Brookhaven PDB format. The different components
of a disordered group are distinguished only by their PART numbers; they
have the same atom names, residue names and residue numbers. In the .lst
output file their residue numbers are indicated by an additional letter
after the residue number ("a" means PART 1, "b" PART
2 etc.). Restraints etc. that reference such atoms are automatically duplicated
so that they are applied separately to each component of the disorder,
subject to the PART rules. It is not possible to reference specific atoms
in individual disorder components if this technique is used, but it turns
out that it is rarely necessary to do so.
27. SAME followed by a residue class may be inserted before FVAR
etc. in the .ins file for SHELXL. Thus: SAME_PHE CA > CZ could be inserted
before the atoms but would behave as if it were inserted just before the
first atom that fits CA_PHE. This is a very simple way of restraining geometrical
similarity for a macromolecule, for example when there are several copies
of a non-standard ligand, and is compatible with SHELXPRO.
28. A large selection of minor bugs have been fixed in SHELXL, including three separate bugs that could cause the program to crash after the last cycle but before calculating the Fourier map! In addition various adjustments have been made to the default values for esds etc., bond lengths are printed to four decimal places and the radial positional esd (calculated from the full covariance matrix) in an atomic position is printed under the atom name in the full atom coordinate table at the end of the refinement. Several of the algorithms, especially those involved in generating restraints before the first refinement cycle, should now be faster and more robust.
9. Support and bug reporting
The author is happy to provide advice by email (email@example.com) or fax (+49-551-392582) but not phone. Questions already answered in this file or in the full documentation may be moved to the bottom of the pile! In particular he would like to be informed of any suspected bugs in the programs or of errors or lack of clarity in the documentation; the current release has benefitted enormously from such contributions by users.
Important announcements about new versions etc. will be posted on the SHELX homepage and on appropriate crystallographic email newsgroups.