SQUASH Documentation

SQUASH is a density modification program suite that combines constraints from Sayre's equation, molecular averaging, solvent flattening and histogram matching (1D & 2D).

History

VMS version	10/AUG/88	Kam Zhang	Department of Physics, University of York, UK
UNIX version	1/OCT/89	Kam Zhang	Department of Physics, University of York, UK
further work	16/NOV/89	Kevin Cowtan	Department of Physics, University of York, UK
further extension	22/MAY/92	Kam Zhang	Molecular Biology Institute, University of California at Los Angeles, USA
SQUASH 1.0	21/DEC/92	Kam Zhang	Molecular Biology Institute, University of California at Los Angeles, USA
SQUASH 2.0	24/MAY/99	Yeu-Perng Nieh	Fred Hutchinson Cancer Research, Seattle, USA

References

Zhang, K. Y. J. (1993). SQUASH - Combining Constraints for Macromolecular Phase Refinement and Extension. Acta Cryst. D49, 213-222.
Nieh, Y. P. & Zhang, K. Y. J. (1999) A 2D histogram matching method for protein phase refinement and extension. Acta Cryst. D in press.

Please report problems or suggestions to kzhang@fhcrc.org

Introduction
Some hints and advice on SQUASHing
Logical names
Key word input cards
Examples
References

Introduction

SQUASH is an integrated density modification program for macromolecular phase refinement and extension that combines histogram matching, solvent flattening, Sayre's equation and molecular averaging.

The constraints used in these methods are the correct local shape of the electron density, equal molecules, solvent flatness and the correct distribution of electron density values and their gradients. These constraints on the electron density are satisfied simultaneously by solving a system of non-linear equations with the Newton-Raphson method using FFTs, followed by a phase combination procedure.

In release 2.0, a two-dimensional (2D) histogram matching method was introduced that employs the joint probability distribution of electron density values and its gradient as a constraint for density modification. This has greatly enhanced the power of existing histogram matching methods which only utilized the constraint on electron density values.

In addition to the density modification methods mentioned above, several useful operations are provided for the user's convenience, e.g. Wilson scaling, calculation of a map from atomic coordinates, and transformation between maps and structure factors. Non-crystallographic symmetry operations can also be refined by a rotation and translation space search and a least-squares minimization method, thereby reducing the chance of introducing systematic phase errors during averaging.

Constraints used in SQUASH and their implementation will be described in the following sections with more emphasis on the 2D histogram matching because it is a new feature in the program. More detailed information can be found in the references given.

Probability distribution of electron density values and its gradients - histogram matching

The density histogram (1D histogram) of a map is the probability distribution of electron density values. It specifies not only the permitted values of the electron density but also their frequencies of occurrence. It provides a global description of the appearance of the map; all spatial information is discarded.

The ideal histogram of an unknown structure can be predicted from well-defined structures because the histogram is independent of structural conformation. Proteins generally have very similar atomic composition and atomic bonding patterns. The differences are mainly the order in which amino acid residues are arranged and the dihedral angles adopted between neighboring residues. The density histogram discards the spatial information of the electron density map and is therefore independent of the factors that make each structure different. The density histogram captures what different structures have in common: the similar atomic composition and the characteristic distances between atoms. These common features distinguish correct structures from incorrect ones. Therefore, the ideal density histogram can be used to improve an electron density map or to select a correct phase set among many randomly generated phase sets in ab initio phasing.

The density histogram, however, is degenerate in encoding structural information. While having an ideal density distribution is a necessary condition for being a correct structure, it is not a sufficient condition. Incorrect structures may incidentally adopt the ideal density distribution. Moreover, the density histogram does not capture all the common features in protein structures. A two-dimensional (2D) histogram matching method was therefore introduced in release 2.0. It employs the joint probability distribution of electron density values and their gradients as a constraint in a density modification procedure.

The density gradient reflects the change of the density value within a local region and therefore provides a description of the local environment. The information from the gradient distribution is complementary to the density distribution, which only accounts for the value at a given point and disregards its neighboring environment. The addition of gradient in the 2D histogram reduces the degeneracy in the 1D density histogram.

Calculation of electron density gradients

The electron density value at position (x,y,z) as a Fourier transform of structure factors F(hkl) is shown in equation (1), where (x,y,z) are the fractional coordinates along the crystal axes, (hkl) are the indices of the structure factors and V is the unit cell volume. The gradients along each of the three crystal axes are shown in equations (2)-(4). The three gradient maps, gx, gy and gz, can be efficiently calculated by FFT using the modified Fourier coefficients hF, kF and lF, respectively. They are then transformed from the crystal axis system to an orthogonal axis system before the 2D histograms are accumulated.
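Equations (1)-(4) are not reproduced in this text. As a reading aid, and assuming the usual crystallographic sign convention for the Fourier synthesis (an assumption, so signs and constant factors may differ from the original equations), they can be written as:

```latex
\begin{align}
\rho(x,y,z) &= \frac{1}{V}\sum_{hkl} F(hkl)\,\exp[-2\pi i(hx+ky+lz)] \tag{1}\\
g_x = \frac{\partial\rho}{\partial x} &= -\frac{2\pi i}{V}\sum_{hkl} h\,F(hkl)\,\exp[-2\pi i(hx+ky+lz)] \tag{2}\\
g_y = \frac{\partial\rho}{\partial y} &= -\frac{2\pi i}{V}\sum_{hkl} k\,F(hkl)\,\exp[-2\pi i(hx+ky+lz)] \tag{3}\\
g_z = \frac{\partial\rho}{\partial z} &= -\frac{2\pi i}{V}\sum_{hkl} l\,F(hkl)\,\exp[-2\pi i(hx+ky+lz)] \tag{4}
\end{align}
```

Under this convention each gradient map is the Fourier synthesis of hF, kF or lF, up to the constant factor -2*pi*i.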

Calculation of structure factors from modified gradient maps

The gradient maps are modified by 1D histogram matching on the gradient, in a way analogous to the 1D histogram matching on density. Although the histogram matching is carried out on the density gradients, it is necessary to transform the modified gradient maps back to structure factors in order to apply the phase combination with the observed structure factors. Based on equations (2)-(4), the structure factors can be calculated through an inverse FFT with any one of the three density gradients gx, gy and gz as the Fourier coefficients (equations (5)-(7)). Therefore, after the histogram matching on gx, gy and gz followed by the inverse FFT, three structure factor sets are generated, which are then averaged to give the final structure factor set.
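The round trip between structure factors and gradient maps can be sketched with NumPy. This is an illustration only, not SQUASH's own code: the function names are hypothetical, the NumPy forward FFT is assumed to play the role of the Fourier synthesis, space-group symmetry is ignored, and the conversion from crystal axes to orthogonal axes is left out. It also makes one limitation explicit: a reflection with h = 0 cannot be recovered from gx alone, since its coefficient hF vanishes.

```python
import numpy as np

def index_grids(shape):
    """Integer indices h, k, l laid out in NumPy FFT order (hypothetical helper)."""
    axes = [np.fft.fftfreq(n, d=1.0 / n) for n in shape]
    return np.meshgrid(*axes, indexing="ij")

def density_and_gradients(F, volume=1.0):
    """Density and its gradients along the fractional axes from coefficients F."""
    h, k, l = index_grids(F.shape)
    rho = np.fft.fftn(F) / volume
    # The gradient along x is the synthesis of (-2*pi*i*h) * F, and so on.
    grads = [np.fft.fftn(-2j * np.pi * idx * F) / volume for idx in (h, k, l)]
    return rho, grads

def structure_factors_from_gradient(g, idx, volume=1.0):
    """Recover F from one gradient map; reflections with idx == 0 are lost."""
    coeff = np.fft.ifftn(g) * volume          # equals (-2*pi*i*idx) * F
    F = np.zeros_like(coeff)
    usable = idx != 0
    F[usable] = coeff[usable] / (-2j * np.pi * idx[usable])
    return F, usable
```

Averaging the three recovered sets, as described above, would then use for each reflection only those sets in which it is usable.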

2D histogram matching procedure

The 2D histogram matching on density and its gradients is achieved through two alternating steps of 1D matching on density and 1D matching on gradients. The histogram matching on density follows the method described by Zhang & Main (1990a). In this method, the new electron density value is derived from the old electron density value through a linear transform such that the cumulative distribution of the new density values equals the cumulative distribution of the ideal histogram. The histogram matching on gradients follows a similar protocol in which the density value is replaced by the gradient. The modified gradient maps are converted to modified structure factors by the fast Fourier transform method.
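The idea of matching cumulative distributions can be illustrated with a generic quantile mapping. This is a simplified sketch, not the binned linear transform of Zhang & Main (1990a): the function name is hypothetical, and the ideal histogram is represented here by a set of sample values rather than by tabulated bins.

```python
import numpy as np

def match_histogram(values, ideal_samples):
    """Map values so that their cumulative distribution follows ideal_samples."""
    flat = values.ravel()
    order = np.argsort(flat, kind="stable")
    # Cumulative probability of every point in the input map.
    ranks = np.empty(flat.size, dtype=float)
    ranks[order] = (np.arange(flat.size) + 0.5) / flat.size
    # Value of the ideal distribution at the same cumulative probability.
    ideal_sorted = np.sort(np.asarray(ideal_samples, dtype=float).ravel())
    ideal_cdf = (np.arange(ideal_sorted.size) + 0.5) / ideal_sorted.size
    matched = np.interp(ranks, ideal_cdf, ideal_sorted)
    return matched.reshape(values.shape)
```

The mapping is monotonic, so the rank order of the grid points is preserved; only the values change. The same function could be applied to a gradient map in place of a density map.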

The implementation of 2D histogram matching has been tested in two modes, the parallel mode and the sequential mode. In the parallel mode, the histogram matching on density and on gradients is applied in parallel using the same initial structure factor set. After matching, the two new structure factor sets are combined. In the sequential mode, the structure factor set calculated after density histogram matching is used as input for the histogram matching on gradients, and vice versa. Our test cases suggest that the sequential mode gives better phase improvement in fewer matching cycles and converges to higher histogram correlation coefficients than the parallel mode. Therefore, we suggest that users use the sequential mode for 2D histogram matching.

Generation of ideal histograms

The consensus histograms were generated from the averaged histograms calculated from the atomic coordinates of 16 protein structures of different fold families, after removing the overall temperature factor from the electron density map. Therefore, these ideal histograms are independent of the temperature factor of protein structures. Since 2D histograms are resolution dependent, several ideal histograms were generated for a range of resolutions from 4 Å to 1.0 Å using the method described by Goldstein and Zhang (1998).

Solvent flatness - solvent flattening

Solvent flattening exploits the fact that the electron density in the solvent region is featureless at medium resolution, owing to the high thermal motion and disorder of solvent molecules. The flattening of the solvent region suppresses noise in the map and therefore improves phases.

A solvent mask is needed for solvent flattening. If the user does not supply a solvent mask, it will be calculated by Wang's automated convolution algorithm (1985) using the reciprocal space approach of Leslie (1987). Once the envelope is determined, solvent flattening is performed simply by setting the density in the solvent region to the expected solvent density value.
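The flattening step itself is simple once a mask exists. A minimal sketch, assuming the mask is a boolean array that is True inside the protein region (the function name and this mask convention are assumptions, not SQUASH's file format):

```python
import numpy as np

def flatten_solvent(rho, protein_mask, solvent_value):
    """Return a copy of rho with every solvent grid point set to solvent_value."""
    protein_mask = np.asarray(protein_mask, dtype=bool)
    flattened = rho.copy()              # leave the input map untouched
    flattened[~protein_mask] = solvent_value
    return flattened
```

For example, `flatten_solvent(rho, mask, 0.32)` would use the default average solvent density listed under the SOLV card.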

Local shape of the electron density - Sayre's equation

Sayre (1952) pointed out that for equal and resolved atoms, the density distribution is equal to the squared density convoluted with an atomic shape function. Sayre's equation constrains the local shape of electron density. It is an exact equation at atomic resolution in an equal atom system. For macromolecular structures where atomic resolution data are seldom available, the shape function is modified to accommodate the overlapping of atoms at non-atomic resolution. Please see Main (1990) and Zhang & Main (1990b) for details of the implementation of Sayre's equation.
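For reference, Sayre's equation for equal, resolved atoms is commonly written in the two equivalent forms below. The symbols are notation chosen here: psi is the atomic shape function in real space and theta is its reciprocal-space counterpart (the ratio of the scattering factor of an atom to that of the squared atom). Normalization conventions vary between papers, so the constant factors should be read as an assumption.

```latex
\begin{align}
\rho(\mathbf{x}) &= \int_V \rho^{2}(\mathbf{y})\,\psi(\mathbf{x}-\mathbf{y})\,d\mathbf{y}\\
F(\mathbf{h}) &= \frac{\theta(\mathbf{h})}{V}\sum_{\mathbf{k}} F(\mathbf{k})\,F(\mathbf{h}-\mathbf{k})
\end{align}
```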

Equal Molecules - molecular averaging

Molecular averaging enforces the equivalence of electron density values at grid points related by non-crystallographic symmetry (NCS). The averaging procedure can filter noise, correct systematic errors, and even determine phases ab initio in favorable cases.

Determination of NCS and the mask

The self-rotation symmetry is now routinely solved by the use of a Patterson rotation function (Rossmann & Blow, 1962). The translation symmetry can be determined by a translation function (Crowther & Blow, 1967) when a search model, either an approximate structure of the protein to be determined or the structure of a homologous protein, is available. The search of the Patterson rotation and translation functions is typically carried out using software such as AMORE (Navaza, 1994). In cases where no search model is available or the Patterson translation function is unsolvable, either the whole electron density map, or a region which is expected to contain a molecule, may be rotated using the rotation solution and used as a search model in a phased translation function (Read & Schierbeek, 1988).

Once the averaging operators are determined, the mask can be obtained using the local density correlation function as developed by Vellieux et al(1995). This is achieved by a systematic search for extended peaks in the local density correlation, which must be carried out over a volume of several unit cells in order to guarantee finding the whole molecule. The local correlation function distinguishes those volumes of crystal space that map onto similar density under transformation by the averaging operator. Therefore, in the case of improper NCS, a local correlation mask will cover only one monomer. In the case of a proper symmetry, a local correlation mask will cover the whole complex, since every operator will map one copy of the molecule onto another copy.

NCS refinement

The initial NCS operation obtained from rotation and translation functions or heavy atom positions can be fine-tuned by a density space R-factor search in the six dimensional rotation and translation space. The six-dimensional search is very time-consuming. The search rate can be increased by using only a representative subset of grid points. The NCS operation is systematically altered to find the lowest density space R-factor for the selected subset of grid points.

The translation search rate can be increased by using 3 fast Fourier transforms (FFTs) according to the convolution theorem. The 6-D rotation and translation space search is then reduced to a 3-D rotation space search with 3 FFTs.

The NCS operation solution from the 6-D rotation and translation search can be further refined by a least-squares procedure. The solution to the NCS parameters can be obtained by minimizing the density residual between the NCS related molecules.

Averaging of NCS related molecules

Once the mask and the matrices are determined, the electron density map may be modified by averaging. Averaging is carried out as proposed by Bricogne(1976) except that the double sorting procedure is avoided by storing the whole map in the memory.

Every grid point in the unit cell is mapped to the grid point within the subunit mask where the NCS holds and the NCS operation is subsequently applied.

The program can deal with a general non-crystallographic operation given the mask of the subunit where the NCS operator is valid.

The mapping procedure gives the appropriate crystallographic or non-crystallographic operation which could transform each grid point to the subunit mask. Thus, the mask for partitioning the protein from solvent can be derived from this mapping.

The average density value for each set of NCS-related grid points is determined. The value assigned to each grid point can be adjusted toward the mean density according to a weight.

Phase combination

Once a modified map is obtained, modified structure factors can be calculated by inverse FFT. The MIR phase probability distribution is given by Blow and Crick (1959). The probability distribution for phases calculated from the modified map is determined using Sim's weighting scheme (Sim, 1959) as adapted by Bricogne (1976). The phases are combined by multiplying their respective phase probabilities (Bricogne, 1976). This multiplication of phase probabilities is simplified by adding the coefficients that code for the phase probabilities (Hendrickson & Lattman, 1970).
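Why adding coefficients is equivalent to multiplying probabilities can be seen from the standard Hendrickson-Lattman form P(phi) proportional to exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)). The sketch below is illustrative only; the function names are hypothetical and the centroid is evaluated by simple numerical integration on a one-degree grid.

```python
import numpy as np

def combine_hl(hl_first, hl_second):
    """Multiply two phase probability distributions by adding their coefficients."""
    return tuple(a + b for a, b in zip(hl_first, hl_second))

def centroid_phase(hl, n_steps=360):
    """Best (centroid) phase in degrees and figure of merit for coefficients A, B, C, D."""
    A, B, C, D = hl
    phi = np.deg2rad(np.arange(n_steps) * 360.0 / n_steps)
    log_p = (A * np.cos(phi) + B * np.sin(phi)
             + C * np.cos(2 * phi) + D * np.sin(2 * phi))
    p = np.exp(log_p - log_p.max())     # subtract the maximum to avoid overflow
    p /= p.sum()
    centroid = np.sum(p * np.exp(1j * phi))
    return np.degrees(np.angle(centroid)) % 360.0, abs(centroid)
```

A purely bimodal distribution such as `(0, 0, 4, 0)` has a figure of merit close to zero; combining it with a unimodal distribution such as `(2, 0, 0, 0)` favors one of the two peaks and raises the figure of merit.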

Some hints and advice on SQUASHing

Memory usage
SQUASH uses dynamic memory allocation. The actual memory needed to run SQUASH depends on which functions you are running. For example, HM2D (2D histogram matching) needs more memory than WILS (Wilson statistics) does. The grid sampling NX, NY, NZ for your structure determines the size of several big arrays which are used to store the structure factor data and map data. Be sure to set them appropriately. See the GRID card for details.
Scaling
If histogram matching or Sayre's equation is used, the structure factor amplitudes must be on an absolute scale. The scale factor and overall temperature factor can be estimated from Wilson statistics (FUNC WILS) by giving the unit cell content. Note that it is difficult to estimate the exact unit cell content since there is a large volume of disordered solvent. A good estimate of the overall temperature factor can be achieved given the correct ratio of chemical composition in the unit cell. The scale factor from Wilson statistics is generally underestimated since the solvent contribution is either ignored or not given correctly. The program uses histogram scaling to correct the Wilson scale factor by matching the variance of the protein region of the map to that of the ideal histogram.
On masking
The mask can be calculated automatically within the program and it will be updated after every cycle of refinement. The user can input a mask by assigning a file with logical name MASKIN. This is recommended when the molecule has flexible regions or at the later stage of map interpretation when the partial model is known.
The mask used for averaging is the MASK or MASKIN by default. The user can give a different mask for averaging by assigning the logical file name MASKAVE. This is used when the molecules are related by improper non-crystallographic symmetry or when only a portion of the molecule is related by the NCS operation. The MASKAVE defines the single molecule for which the NCS operations are valid. You can also supply an optional mask, called MASKAVE2, which defines those molecules which are related by the NCS operations.
Phase refinement
Sayre's equation works better at higher resolution. Experiments have shown that it works even starting from 3.5 Å. Nevertheless, it is recommended to give Sayre's equation a low weight when refining at lower resolution.
Phase extension
Extend the phases within a resolution range in a number of steps. The reciprocal space vector length of the extended resolution should not be longer than that of the starting resolution.
On SIR phase improvement
If you want to use SQUASH to break the phase ambiguity, it is highly recommended to use Hendrickson & Lattman coefficients instead of the figure of merit. Since the best phase is the average of the two possible phases, the FOM can only give you a phase distribution centered between the two possible phases, whereas the coefficients A B C D truly represent the bimodal distribution centered on the two possible phases. This makes it easier to break the phase ambiguity by density modification through phase recombination.
Graph utilities
SQUASH provides a CGI Perl script for generating graphs (e.g. a graph of histogram correlation coefficient vs. refinement cycle) to give users graphical information about the density modification results. You need a Web server set up on your own machine or on another machine that you have access to, and you need to install the script under your web server's 'cgi-bin' directory. You also need to define the CGIBIN environment variable, which refers to the URL containing this CGI script (CGIBIN should have been defined in the 'squash.setup' file in the $SQUTOP directory).
When you read the HTML-based log file (i.e. ${HTMLBASE}.html) in your browser, you will see in the index frame the link to 'Plot 2D histogram c.c.' (if you are running with s2hg function). If you follow that link, you will see the 'Plot' button in the content frame. Click that button and SQUASH will initiate the CGI script. When asked for the file to upload for plotting, give it the content frame file, i.e. ${HTMLBASE}_log.html .
An off-line Perl script 'gengraph.pl' is also provided in the package for generating graphs without needing a Web server. This script by default can be found in the same directory as the SQUASH executable. When running this script, use the following command:
gengraph.pl 'logfile' 'output-gif-file'
where,
'logfile' is the content frame log file, ie. ${HTMLBASE}_log.html
'output-gif-file' is the file name for the output image file in GIF format.
In either case, you need Perl (5.004 or later) and two Perl modules, GD (1.15 or later) and GIFgraph (1.10 or later), installed.

Logical names

HKLIN
input MTZ file.
HKLOUT
Output MTZ file.
HKLOUT contains those columns in the HKLIN and those extra columns specified by LABO.
HTMLBASE
HTML-based output log file.
Three html files will be output, ie. ${HTMLBASE}.html, ${HTMLBASE}_index.html and ${HTMLBASE}_log.html .
You can use your browser to read ${HTMLBASE}.html which will display two frames within its page, one is the index frame (${HTMLBASE}_index.html) and the other is the content frame (${HTMLBASE}_log.html). Or you can simply read the content page itself, ie. ${HTMLBASE}_log.html .
MAPIN
Map input (optional).
If given, the program will read in the map as a starting map for SQUASH. The map can be in any axis order. It will be permuted by the program.
MAPOUT
Map output (optional).
If given, the program will write the final map after SQUASH in any given axis order.
MASK
Molecular envelope file, needed for histogram matching and solvent flattening. The mask will be calculated by the program.
MASKIN
Input mask from user which partitions the protein from solvent. Activated by assigning file to MASKIN. It will override MASK. It must be in packed CCP4 format for the full unit cell.
MASKAVE
Input mask for averaging which specifies the region of map where the non-crystallographic symmetry(NCS) is operated upon.
This must be the mask corresponding to the single subunit of protein where the NCS holds in case of improper NCS. Activated by assigning a file name to MASKAVE. If not given, the MASK or MASKIN will be used for averaging.
MASKAVE2
Input mask for averaging which specifies the region of map where the non-crystallographic symmetry(NCS) holds.
This must be the mask corresponding to the subunits that are related by the NCS operations. Activated by assigning a file name to MASKAVE2. If not given, the MASKAVE2 will be generated from MASKAVE and NCSGrid, which defines the box where the NCS operations are valid.
SYMOP
symmetry operation file
XYZIN
Input PDB file needed for some functions such as 'FUNC ATSF'

Key word input cards

AXIS CELL CONT DPLB END FCLB FORM FUNC GRID HKLW HMCT LABI LABO LOOK MATR MNXH MODE NCSG NGAU NMUL PRTV RANG RESO ROTR RSCB SCAL SIGM SOLC SOLV SPLL SYMM TITL XYZL

Note: All keywords are case insensitive and only the first 4 characters are needed. Items are in free format and are separated by spaces or commas. Items in square brackets are optional. Items in curly brackets are alternatives.

AXIS - Order of fast, medium and slow(section) axis for output map.
AXIS Z, X, Y (optional) (default)
CELL - Cell dimensions
CELL A, B, C (in Å), alpha, beta, gamma (in degree)
CONT - Unit cell content.
The program stores scattering factors for most of the elements in the periodic table. Each atom type is denoted by its periodic table symbol and followed by the number of atoms of that type in the unit cell. Additional form factors can be read in from the FORM card.
DPLB - DPhi LaBels
DPLB Rmax Phi1 Phi2 FO [SIGFO]
The program calculates the phase error between Phi1 and Phi2; this is implemented through the function CORR. Rmax is the maximum value of the resolution used for the calculation. Phi1 and Phi2 are the assigned column names for the phase data to be compared. FO is the assigned column name for Fobs.
END - End data card
END must be placed at end of .com file to signify end of data cards
FCLB - F Correlation LaBels
FCLB Label1 Label2
They are used in conjunction with the function CORR. Label1 and Label2 are character strings as specified in the LABO or LABI assignments. SQUASH determines the correlation between two specified sets of reflections at the end of the run. Label1 is typically Fobs and Label2 is the assignment for FCOUT.
FORM - Form factor input
FORM AtomName, a1, a2, b1, b2, c
AtomName - name of the atom.
a1, a2, b1, b2 and c - the atomic scattering factors in the Intl. Tables.
FUNC - Function(s) to apply
FUNC ['wilson'], ['sayr'], ['solv'], ['hm2d'], ['averaging'], ['ncsrefine'],['rsearch'],['patsearch']
Choose any one or a combination of above character strings to specify what kind of function that the program should perform.
where,
- aver - Averaging
- hm2d - Histogram matching(1d & 2d)
- ncsr - Non-crystallographic symmetry refinement by least squares
- pats - NCS refinement by convolution
- rsea - NCS refinement by real space R-factor search
- sayr - Sayre's equation
- solv - Solvent flattening
Additional functions available are:
- atmp - calculates electron density map from input PDB file
- atsf - calculates structure factors from input PDB file
- corr - calculates correlation between Fo and Fc and determines phase error
- mask - calculates solvent mask from input reflection data
- mpsf - calculates structure factors from input electron density map
- sfmp - calculates electron density map from input reflections
- spli - "splices" together input and extended phases for continued refinement and extension
- wils - determines overall temperature factor and absolute scale factor for input reflection data
GRID - Number of grids along X, Y and Z axes.
GRID NX, NY, NZ
See restrictions below and beware that prime factors can't be higher than 19.
The electron density is sampled according to this grid setup, so if it is too coarse you will introduce errors into the structure factor calculation. On the other hand, the time and the memory required depend largely on HMAX, KMAX, LMAX (calculated by the program from the cell dimensions and the maximum 4*SIN(theta)**2/lambda**2) and on NX, NY, NZ, so it is advantageous to keep them as small as possible.
NX, NY and NZ must always be greater than 2*HMAX, 2*KMAX and 2*LMAX, respectively. The recommended grid sampling is 1/2 of your data resolution, or 1/3 of it if you are running functions involving gradient calculation, such as 2D histogram matching.
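A grid dimension satisfying these rules can be picked as sketched below. The helper names are hypothetical, and the sketch makes three assumptions: the grid spacing is taken to be the resolution divided by `points_per_resolution` (2 or 3), the maximum index along an axis is estimated as the cell edge divided by the resolution (exact only for an orthogonal axis), and any further restrictions on the grid are not checked.

```python
import math

def largest_prime_factor(n):
    """Largest prime factor of a positive integer (returns 1 for n == 1)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest, n = factor, n // factor
        factor += 1
    return max(largest, n) if n > 1 else largest

def suggest_grid(cell_edge, resolution, points_per_resolution=3, max_prime=19):
    """Smallest grid size along one axis that meets the rules described above."""
    hmax = math.floor(cell_edge / resolution)
    # Must exceed 2*HMAX and be at least as fine as the recommended sampling.
    n = max(2 * hmax + 1,
            math.ceil(cell_edge * points_per_resolution / resolution))
    while largest_prime_factor(n) > max_prime:
        n += 1
    return n
```

For a 69 Å cell edge at 3 Å resolution with three points per resolution length, the first candidate is 69 = 3 x 23, which has a prime factor above 19, so the sketch moves on to 70.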
HKLW - Weighting scheme and radius of sphere for averaging
HKLW mode, radius

mode : 1 - weight by 1 - r/R

: 2 - weight by 1 - (r/R)²

radius - in Å (recommended between 6 - 10Å)

HMCT - 2d histogram matching mode
HMCT mode
This card is used with 'FUNC HM2D', which performs 2D histogram matching. The following three modes are available. The 2D sequential mode is highly recommended!

mode : 1 - 1d matching on rho only

: 2 - 2d sequential mode (1d matching on rho + 1d matching on gx, gy, gz)

: 3 - 2d parallel mode (1d matching on rho + 1d matching on gx, gy, gz)

Default : HMCT 2

LABI - Input labels for reflection data

LABI FO=? [SIGFO=?] PHIO=? [FOMO=?] [AO=?] FC=? [SFC=?] PHIC=? [FOMC=?] [AC=?]

? represents the appropriate column labels.

FO, PHIO, FC and PHIC are compulsory.

FO : Observed Structure factors

SIGFO : 'standard deviation' of FO

PHIO : Experimental phase angle in degrees

FOMO : Figure of merit for PHIO

AO : Hendrickson & Lattman coefficients (A B C D) for phase probability

The above are the data used for phase recombination, so the magnitudes, phases and figures of merit should be experimental ones such as MIR phases and FOM.

You just need to assign program label AO if you want to use Hendrickson and Lattman coefficients. It assumes that B C D follow right after A in the MTZ file.

Note: A B C D is highly recommended over FOMO if you are trying to improve SIR phases.

FC : Structure factor magnitudes for the initial map

SFC : 'standard deviation' of FC

PHIC : Initial phase angle in degrees

FOMC : Figure of merit for PHIC

AC : Hendrickson & Lattman coefficients (A B C D) for phase probability

The above are the structure factor, phase and FOM to calculate the initial map. Note that FC, SFC, PHIC and FOMC are normally the same as FO, SIGFO, PHIO and FOMO but they can be different, such as when you want to initiate the calculation with a modified map. In this case, the FC, SFC, PHIC and FOMC are those corresponding to the modified map.

You just need to assign program label AC if you want to use Hendrickson and Lattman coefficients. It assumes that B C D follow right after A in the MTZ file.


LABO - Output labels for reflection data

LABO FCOUT=? PHICOUT=? [FOMCOUT=?] [AOUT=?]

? represents the appropriate column labels.

FCOUT : the column label for the output magnitudes

PHICOUT : the column label for the output phases

FOMCOUT : the column label for the combined figure of merit

You can also write out Hendrickson & Lattman coefficients for the combined phase probability by giving

FCOUT : the column label for the output magnitudes

PHICOUT : the column label for the output phases

ACOUT : the column label for the output H-L coefficient A

BCOUT : the column label for the output H-L coefficient B

CCOUT : the column label for the output H-L coefficient C

DCOUT : the column label for the output H-L coefficient D

Note: The program automatically calculates the phase and FOM from the Hendrickson-Lattman coefficients and vice versa. Therefore, you only need to supply either the phase or the H-L coefficients. However, if both phase and H-L coefficients are given, the phase takes priority.


LOOK - Switch controls the output of intermediate calculation results
LOOK key

key - 0: Minimum output (default)

1: Monitor the conjugate gradient solution

2: Monitor the intermediate maps

3: Monitor the least squares NCS refinement.

The higher the number, the more output!
MATRIX - NCS operation in matrix form

MATR m11 m12 m13

m21 m22 m23

m31 m32 m33

m41 m42 m43

The first three lines are the rotation matrix elements. The last line is the translational component.
MNXH - Minimum and maximum indices
MNXH Hmin Kmin Lmin Hmax Kmax Lmax
Hmin, Kmin, Lmin, Hmax, Kmax and Lmax are the minimum and maximum values of the indices H, K and L, respectively. This card is used only if you want to override the values calculated by the program.

MODE - Control mode

MODE Method, ThetaType, [Weight1], [Weight2]

method : 1 - full matrix calculation

: 2 - diagonal approximation (default)

ThetaType : 1 - analytical formula (for atomic resolution data)

: 2 - numerical curve fitting

: 3 - empirical data (default, preferred for proteins)

Weight1 - relative weight between density modification (histogram matching/solvent flattening/averaging) and Sayre's equation. The scale is multiplied on the density modification part of the equation.(Default is 1)

Weight2 - weight on averaging. (Default is 1)


NCSGRID - Box in ranges of grid points which contains all the NCS related molecules
NCSG nxmin, nxmax, nymin, nymax, nzmin, nzmax
The parameters are the minimum and maximum value along A, B and C directions.
NGAUSS - Number of coefficients to use for gaussian expansion of the electron density.
Use 5 for this calculation.
Note: use this card in conjunction with the FORM card.
NMUL - Systematic absences check
NMUL nmult, jhe, jke, jle
Reflections satisfying jhe*H + jke*K + jle*L = nmult*N are possible reflections, where N is any integer.
Default: NMUL 0, 0 ,0, 0
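The condition can be expressed as a small test. This sketch uses a hypothetical function name and assumes that the default nmult = 0 means that no restriction is applied, which the text does not state explicitly.

```python
def is_possible_reflection(h, k, l, nmult, jhe, jke, jle):
    """True if jhe*h + jke*k + jle*l is an integer multiple of nmult."""
    if nmult == 0:
        return True  # assumption: the default 0, 0, 0, 0 disables the check
    return (jhe * h + jke * k + jle * l) % nmult == 0
```

Under this reading, `NMUL 2, 1, 1, 1` would keep only reflections with H + K + L even, the condition for a body-centred lattice.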
PRTVAL - Average protein density
PRTV PrtAve
PrtAve - average protein density value (default: 0.43 e/Å³)

RANGE - Range and stepsize for real space R-factor search or convolution search.

RANG nalpha, nbeta, ngamma, drot, nx, ny, nz, dtrans

nalpha - number of search steps along Alpha

nbeta - number of search steps along Beta

ngamma - number of search steps along Gamma

drot - stepsize for rotation in degrees

nx - number of search steps along X

ny - number of search steps along Y

nz - number of search steps along Z

dtrans - stepsize for translation in angstroms

nx,ny,nz,dtrans are not used for convolution search.


RESO - Resolution limits ( in Å)
RESO rmin, rmax [, rext, rstp ]

rmin - minimum resolution

rmax - maximum resolution

rext - extended resolution (default: rext=rmax)

rstp - number of steps needed for extension

ROTRAN - Rotation and translation for the NCS operation
The input can be in Eulerian angles, spherical polar angles or direction cosines. The translation must be in the orthogonal system in Å.
a) ROTR EULER alpha, beta, gamma

tx, ty, tz

alpha, beta, gamma are Eulerian angles in (Z Y Z) rotation. tx, ty and tz are the translational part.
b) ROTR SPHER omega, phi, kappa

tx, ty, tz

omega, phi and kappa are spherical polar angles. omega is the angle with respect to Z axis. phi is the angle with respect to X axis on the XY plane. kappa is the rotation angle. tx,ty and tz are the translational part.
c) ROTR DCOS dcx, dcy, dcz, kappa

tx, ty, tz

dcx,dcy and dcz are the direction cosine of the rotation axis with respect to X Y and Z. kappa is the rotation angle. tx,ty and tz are the translational part.
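The three parametrisations describe the same kind of object, a 3x3 rotation matrix. The Python sketch below builds the matrix from each form. The Eulerian convention R = Rz(alpha) Ry(beta) Rz(gamma) and the sign of the rotations are assumptions, since only "(Z Y Z)" is stated above; all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def from_euler_zyz(alpha, beta, gamma):
    """Assumed convention: R = Rz(alpha) * Ry(beta) * Rz(gamma)."""
    return matmul(matmul(rot_z(alpha), rot_y(beta)), rot_z(gamma))


def from_direction_cosines(dcx, dcy, dcz, kappa):
    """Rodrigues' formula: rotation by kappa degrees about the axis (dcx, dcy, dcz)."""
    norm = math.sqrt(dcx * dcx + dcy * dcy + dcz * dcz)
    x, y, z = dcx / norm, dcy / norm, dcz / norm
    c, s = math.cos(math.radians(kappa)), math.sin(math.radians(kappa))
    t = 1.0 - c
    return [[t * x * x + c, t * x * y - s * z, t * x * z + s * y],
            [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
            [t * x * z - s * y, t * y * z + s * x, t * z * z + c]]


def from_spherical_polar(omega, phi, kappa):
    """Axis at omega from Z, its XY projection at phi from X; then as above."""
    o, p = math.radians(omega), math.radians(phi)
    return from_direction_cosines(math.sin(o) * math.cos(p),
                                  math.sin(o) * math.sin(p),
                                  math.cos(o), kappa)


def rotation_angle(r):
    """Rotation angle kappa in degrees, recovered from the trace of the matrix."""
    cos_kappa = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_kappa))))


# Rotation part of the card used in the NCS examples: ROTR EULER 0.0 179.6 331.5
r = from_euler_zyz(0.0, 179.6, 331.5)
print("kappa = %.2f degrees" % rotation_angle(r))
```

For the example card the rotation angle comes out close to 180 degrees, which is what one expects for a near two-fold NCS axis.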
RSCB - Resolution limits (in Å) used only for Wilson scaling.
RSCB rmin, rmax [,nbin]

rmin - minimum resolution

rmax - maximum resolution

nbin - number of bins in the resolution range (1 to 100; default: nbin=50)

SCALE - Scale factor and overall B factor to put Fobs on absolute scale.
SCAL K, B
F'obs = K * Fobs * EXP(-B * S)
where S = (SIN(theta)/lambda)**2
Default: SCAL 1.0 0.0
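The formula is easy to evaluate by hand once S is written in terms of the resolution d of a reflection: Bragg's law gives sin(theta)/lambda = 1/(2d), so S = 1/(4d²). The sketch below applies it in Python; `scale_fobs` is a hypothetical helper and the amplitude of 100 is an illustrative value.

```python
import math


def scale_fobs(fobs, d, k=1.0, b=0.0):
    """Apply F'obs = K * Fobs * exp(-B * S), with S = (sin(theta)/lambda)**2.

    d is the resolution of the reflection in Å; from Bragg's law
    sin(theta)/lambda = 1/(2d), hence S = 1/(4 d**2).
    """
    s = 1.0 / (4.0 * d * d)
    return k * fobs * math.exp(-b * s)


# SCAL 1.4 13.4 applied to an amplitude of 100 at two resolutions
for d in (3.0, 1.9):
    print("d = %.1f Å: F'obs = %.2f" % (d, scale_fobs(100.0, d, k=1.4, b=13.4)))
```

With the default card (SCAL 1.0 0.0) the amplitudes are returned unchanged.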
SIGMA - Reflection acceptance threshold
SIGM sigma
Reflections with Fobs < sigma * sd(Fobs) will not be included in the calculation.
Default: SIGM 0.0
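A minimal Python sketch of this acceptance test, with made-up (Fobs, sd) pairs and a hypothetical `accept` helper:

```python
def accept(fobs, sd, sigma=0.0):
    """True if the reflection is kept, i.e. it is NOT the case that Fobs < sigma * sd."""
    return not (fobs < sigma * sd)


# Illustrative (Fobs, sd(Fobs)) pairs, not taken from any data set
reflections = [(120.0, 4.0), (9.0, 5.0), (30.0, 10.0)]
kept = [r for r in reflections if accept(*r, sigma=2.0)]
print(kept)
```

With the default SIGM 0.0 no reflection with a non-negative amplitude is rejected.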
SOLC - Solvent content in the unit cell
SOLC SolvCont
SolvCont - solvent content of the unit cell. The examples give it as a fraction (SOLC 0.30 for 30% solvent).
SOLVAL - Average solvent density
SOLV SolAve
SolAve - average solvent density value (default: 0.32 e/Å³)
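How SQUASH uses the PRTVAL and SOLVAL values internally is not described here. The sketch below only shows the volume-weighted mean density they imply for a given solvent content, taking the solvent content as a fraction as in the examples; `mean_cell_density` is a hypothetical helper.

```python
def mean_cell_density(solvent_fraction, solvent_density=0.32, protein_density=0.43):
    """Volume-weighted mean electron density of the unit cell, in e/Å^3.

    The defaults are the SOLVAL and PRTVAL defaults quoted in this section.
    """
    if not 0.0 <= solvent_fraction <= 1.0:
        raise ValueError("solvent_fraction must lie between 0 and 1")
    return (solvent_fraction * solvent_density
            + (1.0 - solvent_fraction) * protein_density)


# SOLC 0.30 with the default densities: 0.30 * 0.32 + 0.70 * 0.43
print(round(mean_cell_density(0.30), 3))  # 0.397
```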
SPLL - Splicing labels
SPLL PhiLabel P FOMlabel W
Use in conjunction with the SPLICE function. The user provides PhiLabel and FOMlabel, character strings that name the spliced phase and figure-of-merit columns. If splicing alone, LABI must define PHIO, PHIC, FOMO and FOMC. If splicing in conjunction with other SQUASH functions, the program assigns PHIC and FOMC.

SYMM - Symmetry input

SYMM SpGrpNumber {SpGrpName} {symmetry operator}

a) SpGrpNumber - space group number

b) SpGrpName - space group name as a character string

c) Symmetry operator - must be in International Tables format.

Each operator is preceded by the SYMM keyword, and operators are separated by a comma or space.

Default: SYMM 1


TITLE - Job title
TITL ctitle
ctitle - character string (maximum 80 characters) written to the header of HKLOUT
Default: TITL squash
XYZLIM - Range of map to be used for NCS refinement.
XYZL nxmin, nxmax, nymin, nymax, nzmin, nzmax
The parameters are the minimum and maximum values along the A, B and C directions.

Note: The keywords highlighted in magenta in the examples are keywords that are specific for the topic of that example.

Wilson scaling

#!/bin/sh
name=wilson
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
HTMLBASE=$name
set +a	
squash <<MY-DATA
TITL 2ZN insulin MIR data, Wilson plot
FUNC wils
CELL 82.5 82.5 34.0 90 90 120
SYMM 146
GRID 128 128 54
RESO 1000  1.5
RSCB 4.0 1.5 50
CONT C 4662 N 1170 O 3861 S 108 H 11907 ZN 6 NA 6
NMUL 3 -1 1 1
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM 
END
MY-DATA


NCS refinement

Refining NCS by least squares algorithm

#!/bin/sh
name=ncs_ncsr
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MASK=${name}.msk
MASKAVE=INS_mola.msk
MASKAVE2=INS_mola_b.msk
HTMLBASE=$name
set +a	
squash <<MY-DATA
TITL Refining Non-crystallographic symmetry by least squares method
FUNC ncsr 
MODE 2 3 1   
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
NMUL 3 -1 1 1
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Refining NCS by real space R-factor search method

#!/bin/sh
name=ncs_rsea
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MASKAVE=INS_mola.msk
HTMLBASE=$name
set +a	
squash <<MY-DATA
TITL Refining Non-crystallographic symmetry by R-factor search method
FUNC rsea 
MODE 2 3 1  
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM R3
RESO 1000.0 3.0
NCSG -33 43 -4 45 -31 30
RANG   7 7 7 2.0 5 5 5 1.0
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
NMUL 3 -1 1 1
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Refining NCS by convolution method

#!/bin/sh
name=ncs_pats
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MASKAVE=INS_mola.msk
HTMLBASE=$name
set +a	
squash <<MY-DATA
TITL Refining NCS by convolution method
FUNC pats
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0
RANG   7 7 7 2.0 5 5 5 1.0
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.0 0.0
NMUL 3 -1 1 1
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Refining NCS by R-factor search followed by least squares algorithm

#!/bin/sh
name=ncs_rsea_ncsr
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MASK=${name}.msk
MASKAVE=INS_mola.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Refining NCS by R-factor search & LSQ algorithm
FUNC rsea,ncsr
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
RANG   7 7 7 2.0 5 5 5 1.0
SYMM R3
RESO 1000.0 3.0
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.0 0.0
NMUL 3 -1 1 1
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Phase refinement

2D histogram matching

#!/bin/sh
name=hm2d
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL 2ZN insulin MIR data, 2D h.m
FUNC hm2d
HMCT 2
AXIS Z X Y
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9 
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW  1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM
LABO FCOUT=F2D PHICOUT=PHI2D FOMCOUT=FOM2D
END
MY-DATA


Solvent flattening

#!/bin/sh
name=solv
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL 2ZN insulin MIR data, solvent flattening
FUNC solv
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSF PHICOUT=PHISF FOMCOUT=FOMSF
END
MY-DATA


Histogram matching and solvent flattening

#!/bin/sh
name=solv_hm2d
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL 2Zn Insulin phase refn by histogram matching and solvent flattening
FUNC solv,hm2d
HMCT 2
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSF2D PHICOUT=PHISF2D FOMCOUT=FOMSF2D
END
MY-DATA


Averaging and solvent flattening

#!/bin/sh
name=aver_solv
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
MASKAVE2=Ins_mola_b.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL solvent flattening and averaging for phase refinement
FUNC aver,solv
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.0 0.0
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Sayre's equation with diagonal approximation, solvent flattening and histogram matching

#!/bin/sh
name=sayr_solv_hm2d
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase refn by hm2d, solv and Sayre's equation (diagonal)
FUNC sayr, solv, hm2d
HMCT 2
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 3.0
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Sayre's equation with full matrix calculation, solvent flattening and histogram matching

#!/bin/sh
name=sayr_solv_hm2d
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase refn by hm2d, solv and Sayre's equation (full matrix)
FUNC sayr, solv, hm2d
HMCT 2
MODE 1 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Histogram matching, solvent flattening and averaging

#!/bin/sh
name=hm2d_solv_aver
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase refn by hm2d, solv & averaging
FUNC hm2d, solv, aver
HMCT 2
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.4 13.4
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Histogram matching, solvent flattening, averaging and Sayre's equation

#!/bin/sh
name=sayr_solv_hm2d_aver
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
MASKAVE2=Ins_mola_b.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase refn using Sayre eqn, hm2d, solv & averaging
FUNC sayr, hm2d, solv, aver
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.4 13.4
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Phase extension

2D histogram matching

#!/bin/sh
name=hm2d_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extension by histogram matching
FUNC hm2d
HMCT 2
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9 1.5 4
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=F2DE PHICOUT=PHI2DE FOMCOUT=FOM2DE
END
MY-DATA


Solvent flattening

#!/bin/sh
name=solv_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extension by solvent flattening
FUNC solv
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9 1.5 4
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSFE PHICOUT=PHISFE FOMCOUT=FOMSFE
END
MY-DATA


Histogram matching and solvent flattening

#!/bin/sh
name=solv_hm2d_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn by solvent flattening & histogram matching
FUNC hm2d, solv
HMCT 2
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9 1.5 4
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSF2DE PHICOUT=PHISF2DE FOMCOUT=FOMSF2DE
END
MY-DATA


Averaging and solvent flattening

#!/bin/sh
name=aver_solv_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
MASKAVE2=Ins_mola_b.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn by averaging & solvent flattening
FUNC aver,solv
MODE 2 3 1   
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0 2.0 10
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.0 0.0
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Sayre's equation with diagonal approximation, solvent flattening and histogram matching

#!/bin/sh
name=sayr_solv_hm2d_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn by solv, hm2d & Sayre's eqn (diagonal)
FUNC sayr, solv, hm2d
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9 1.5 4
SCAL 1.4 13.4
SOLC 0.30
NMUL 3 -1 1 1
HKLW 1 10.0
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Sayre's equation with full matrix calculation, solvent flattening and histogram matching

#!/bin/sh
name=sayr_solv_hm2d_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn by solv, hm2d & Sayre's eqn (full matrix)
FUNC sayr, solv, hm2d
MODE 1 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
SYMM 146
RESO 1000 1.9 1.5 4
SCAL 1.4 13.4
NMUL 3 -1 1 1
HKLW 1 10.0
SOLC 0.30
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Histogram matching, solvent flattening and averaging

#!/bin/sh
name=hm2d_solv_aver_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn by hm2d, solv & averaging
FUNC hm2d, solv, aver
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0 2.0 10
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.4 13.4
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Histogram matching, solvent flattening, averaging and Sayre's equation

#!/bin/sh
name=sayr_solv_hm2d_aver_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
MASKAVE2=Ins_mola_b.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn using Sayre eqn, hm2d, solv & averaging
FUNC sayr, hm2d, solv, aver
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0 2.0 10
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.4 13.4
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA


Miscellaneous
- Input initial map to start SQUASH
```
#!/bin/sh
name=sayr_solv_hm2d_aver_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPIN=zniso30a.map
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
MASKAVE2=Ins_mola_b.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn by Sayre's eqn., hm2d, solv & averaging
FUNC sayr, hm2d, solv, aver
MODE 2 3 1
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0 2.0 10
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.4 13.4
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA
```
- Output final map after SQUASH
```
#!/bin/sh
name=sayr_solv_hm2d_aver_ext
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MAPOUT=${name}.map
MASK=${name}.msk
MASKAVE=Ins_mola.msk
MASKAVE2=Ins_mola_b.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL Phase extn by Sayre's eqn., hm2d, solv & averaging
FUNC sayr, hm2d, solv, aver
MODE 2 3 1
AXIS Y X Z
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
NCSG -33 43 -4 45 -31 30
SYMM R3
RESO 1000.0 3.0 2.0 10
ROTR EULER    0.0   179.6   331.5
                0.007  -0.166  -0.432
SCAL 1.0 0.0
NMUL 3 -1 1 1
SOLC 0.30
HKLW 1 10.0     
LABI FO=FP SIGFO=SDFP PHIO=AISOB FOMO=FOM FC=FP SFC=SDFP PHIC=AISOB FOMC=FOM   
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
END
MY-DATA
```
- Apply 2D histogram matching and then splice together input and extended phases for continued refinement and extension
```
#!/bin/sh
name=hm2d_splice
mtz=zn215asharp134e.mtz
#
set -a
SYMOP=${CLIB}/data/symop.lib
HKLIN=$mtz
HKLOUT=${name}.mtz
MASK=${name}.msk
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL 2D h.m. and splice the input and extended phases
FUNC hm2d, spli
CELL 82.5 82.5 34 90 90 120
GRID 128 128 54
RESO 1000 1.9 1.8 1
SYMM 146
SOLC 0.30
NMUL 3 -1 1 1
LABI FO=FP PHIO=AISOB FOMO=FOM FC=FP PHIC=AISOB FOMC=FOM
LABO FCOUT=FSQ99 PHICOUT=PHISQ99 FOMCOUT=FOMSQ99
SPLL PHISPL P FOMSPL W 
END
MY-DATA
```
- Calculate structure factors from the PDB coordinate file
```
#!/bin/sh
name=atsf
pdb=z19fin.pdb
#
set -a
SYMOP=${CLIB}/data/symop.lib
XYZIN=$pdb
MAPOUT=${name}.map
HKLOUT=${name}.mtz
HTMLBASE=$name
set +a
squash<<MY-DATA
TITL calculate F from pdb
FUNC atsf
CELL 82.5 82.5 34.0 90 90 120
SYMM 146
SOLC 0.30
NGAU 5
GRID 128 128 54
RESO 1000.0 1.5
HKLW 1 10.0
LABO FCOUT=FCOUT PHICOUT=PHICOUT
END
MY-DATA
```
  
  References
  1. Blow, D. M. & Crick, F. H. C. (1959) Acta Cryst., 12, 794-802.
  2. Bricogne, G. (1976) Methods and Programs for Direct-Space Exploitation of Geometric Redundancies. Acta Cryst., A32, 832-847.
  3. Goldstein, A. & Zhang, K. Y. J. (1998) The 2D histogram as a constraint for protein phase improvement. Acta Cryst. D54, 1230-1244.
  4. Crowther, R. A. & Blow, D. M. (1967) A method of positioning a known molecule in an unknown crystal structure. Acta Cryst. 23, 544-548.
  5. Hendrickson, W. A. & Lattman, E. E. (1970) Representation of Phase Probability Distributions for Simplified Combination of Independent Phase Information. Acta Cryst., B26, 136-143.
  6. Leslie, A. G. W. (1987) A reciprocal-space method for calculating a molecular envelope using the algorithm of B. C. Wang. Acta Cryst., A43, 134-136.
  7. Main, P. (1990a) The Use of Sayre's Equation with Constraints for the Direct Determination of Phases. Acta Cryst., A46, 372-377.
  8. Navaza, J. (1994) AMoRe: an Automated Package for Molecular Replacement. Acta Cryst. A50, 157-163.
  9. Read, R. J. & Schierbeek, A. J. (1988) A Phased Translation Function. J. Appl. Cryst. 21, 490-495.
  10. Rossmann, M. G. & Blow, D. M. (1962) The detection of sub-units within the crystallographic asymmetric unit. Acta Cryst. 15, 24-31.
  11. Sayre, D. (1952) The Squaring Method: a New Method for Phase Determination. Acta Cryst., 5, 60-65.
  12. Sim, G. A. (1959) The distribution of phase angles for structures containing heavy atoms. II. A modification of the normal heavy-atom method for non-centrosymmetrical structures. Acta Cryst., 12, 813-815.
  13. Vellieux, F. M. D., Hunt, J. F., Roy, S. & Read, R. J. (1995) DEMON/ANGEL: a suite of programs to carry out density modification. J. Appl. Cryst. 28, 347-351.
  14. Wang, B. C. (1985) Resolution of Phase Ambiguity in Macromolecular Crystallography. In Diffraction methods for biological macromolecules (Wyckoff, H. W., Hirs, C. H. W. & Timasheff, S. N., eds.), Vol. 115, pp. 90-113. Academic Press, Orlando.
  15. Zhang, K. Y. J. & Main, P. (1990a). Histogram Matching as a New Density Modification Technique for Phase Refinement and Extension of Protein Molecules. Acta Cryst. A46, 41-46.
  16. Zhang, K. Y. J. & Main, P. (1990b). The Use of Sayre's Equation with Solvent Flattening and Histogram Matching for Phase Extension and Refinement of Protein Structures. Acta Cryst. A46, 377-381.

