DDQ
Difference Density Quality
1999 Version 1.0

Written by Focco van den Akker in the lab of Prof. Wim Hol
at the University of Washington, Seattle

Current address:
Dr. F. van den Akker
Department of Molecular Biology/NB20
Cleveland Clinic Foundation
Lerner Research Institute
9500 Euclid Av.
Cleveland OH 44195
vandenf@ccf.org
tel: 216-444-2057

The following website contains a more in-depth description of the DDQ
method and its local and global quality indicators.
http://www.ri.ccf.org/labs/yee/

All Rights Reserved

For any publication arising from use of the DDQ program, a reference to
(van den Akker & Hol, 1999) should be given as:

"Difference density quality (DDQ): A method to assess the global and
local correctness of macromolecular crystal structures"
Focco van den Akker & Wim G.J. Hol (1999) Acta D55, 206-218.

Do not copy, modify, sublicense, distribute or transfer the DDQ program
without the express written consent of the author. Please do take the
time to fill out the ddq.license form and send it to the above address.

Because the software is free of charge, there is no warranty for the
software, to the extent permitted by applicable law.

Aim of DDQ:
---------------------------------------------------------------------------
DDQ is a tool for the automatic assessment of the local and global
accuracy of macromolecular crystal structures, and for assessing whether
crystallographic refinement has been completed satisfactorily. DDQ is
especially meant for structures which benefit from a high local accuracy,
such as targets for rational drug design.

Basis of DDQ:
---------------------------------------------------------------------------
The method is based on the fact that the local and global accuracy of a
crystal structure are reflected by the information found in a hydrated
difference map (defined as a difference map in which the water molecules
are deliberately omitted from the structure factor calculations):

-a correct and fully refined structure will have many strong positive
 peaks corresponding to optimally positioned water molecules and few, if
 any, nearby positive or negative 'shift peaks' due to incorrectly
 positioned residues/moieties. The presence of a correctly positioned
 water peak near a polar (or apolar) residue is used as confirming
 evidence for the local correctness of the structure at that particular
 residue.

-a mostly correct structure will have enough atoms correctly positioned
 that its phase error is low enough to 'pull out' positive water peaks in
 a hydrated difference map, though at lower contour levels. In addition,
 this map highlights incorrect regions of the model (local accuracy) by
 the presence of nearby negative and/or positive 'shift peaks'.

-a completely incorrect model will have random phases and therefore a
 random hydrated difference map with few strong, randomly positioned
 peaks.

Limitations of DDQ:
---------------------------------------------------------------------------
-DDQ as a global quality indicator will only work for structures with a
 resolution of 2.8 Angstrom or better, since small electron density
 features such as water peaks are barely detectable at lower resolution.
 The DDQ method should however still detect local errors, to a limited
 degree, for lower resolution structures.

-DNA and RNA structures tend to have fewer well-ordered waters bound than
 proteins (which have more hydrophilic water cavities) and will therefore
 receive lower global DDQ scores than protein structures.
 The local DDQ indicators should however still be applicable to these
 structures. DNA/protein complexes have been tested and are fine.

DDQ: the program
---------------------------------------------------------------------------
The code is written in FORTRAN77. Compile as follows:

   f77 -o ddq.exe -static ddq_version.f

(or for SGI (IRIX 6.3):
   f77 -o ddq.exe -col120 -static ddq24jan99.f
 and for Digital Unix:
   f77 -o ddq.exe -extend_source -static ddq24jan99.f)

In addition, there are 3 fixed auxiliary files:
   angles.dat
   atom_rad.dat
   symmetry.dat
and one user-changeable file:
   param.dat
which should be linked before running DDQ as shown in the following
.com file:

#
rm DDQ*
ln -s /usr/../atom_rad.dat atom_rad
ln -s /usr/../angles.dat angles
ln -s /usr/../symmetry.dat symmetry
ln -s /usr/../param.dat param
/usr/../ddq.exe
#

In the param.dat file the user has the option of changing the following
keywords:

SPACEGROUP:      in case the PDB header doesn't contain the correct one
WORSTPERC:       the percentage of worst residues that will be sorted and
                 listed in DDQresults.dat
RESOLUTION:      will override the resolution found in the PDB header
IDENT:           identifier, will override the PDB identifier found in
                 the PDB header
SIGCUTOFF:       absolute sigma cutoff for difference peaks, which should
                 be 3.0 in most cases for proper comparison of the global
                 DDQ scores but can be lowered or raised if desired
WATERS:          residue name of waters in the PDB file (normally not
                 needed as DDQ will recognize the waters automatically)
WAT_CUTOFF_DIST: distance cutoff used for considering PDB waters to be
                 equivalent to the waters found by DDQ, in order to
                 validate waters present in the PDB file
                 (default 1.0 Angstrom)

Keywords that are not desired can be commented out with a # or ! sign
(e.g. #SPACEGROUP P212121).

The atom_rad.dat file can be modified to incorporate additional atom
types by adding them after the last entry.

*****************************
INPUT files:
1. list of positive and negative peaks in a 'hydrated' difference map
2. coordinate file (can include waters)

ad 1.) List of positive (> 3.0 sig) and negative (< -3.0 sig) peaks from
       a difference Fourier map in which the water molecules are
       DELIBERATELY removed from the structure factor calculation
       (= hydrated difference map). The map should cover the complete
       molecule (this will be verified). The list of peaks must be in
       Brookhaven PDB format, as obtained by running PEAKMAX (CCP4) using
       the following .com file:

#----------------------------------------------
rm pos_neg.peaks
# Input: CCP4 map, or you can use an X-plor map
# and convert it with 4d_mapman (Uppsala program suite)
# to the CCP4 format.
peakmax mapin ccp4.map << eof > peaksearch.log
THRESHOLD RMS 3.0 NEGATIVES
NUMPEAKS 8000
OUTPUT BROOKHAVEN
eof
mv XYZOUT pos_neg.peaks
#----------------------------------------------

ad 2.) Coordinate file, which can include water molecules, in PDB format
       including CRYST1 and SCALE cards. DDQ can handle multiple
       conformations as well as the presence of HETATMs such as
       DNA/ligands/metals/sugars etc.

*****************************
OUTPUT files:
1. DDQresults.dat
   This file contains the global DDQ quality indicators for the
   structure, each of which is ranked automatically as best, top 25%,
   above average, below average, bottom 25%, or worst score compared to
   10 randomly selected PDB structures in that particular resolution bin.
   In addition, DDQresults.dat contains the top 15% of residues with the
   highest DDQ-S scores, which are potentially indicative of shortcomings
   in the model. One should always inspect residues with positive DDQ-S
   scores as they arise from nearby difference peaks greater than
   3 sigma. If the PDB file contained waters, DDQ will analyze them and
   validate them against the water peaks found by DDQ.

2. DDQw_shell1.pdb, DDQw_shell2.pdb ...
   These files contain the water molecules in hydration shell 1, 2, ...
   and are PDB formatted in case the user would like to view these
   waters.
   The criteria for waters are strict:
   i)   no other density peaks within 2.4 Ang,
   ii)  within the optimal distance range of at least one polar atom
        (for O this is 2.4-3.4 Ang and for N it is 2.5-3.5 Ang),
   iii) not within a distance of RMAXp (maximum cutoff distance for
        positive 'shift peaks') of any other atom (see file
        atom_rad.dat).
   Note that crystallographic symmetry related water molecules are
   highlighted in the REMARK section of the file.

3. DDQunassigned.pdb
   This file contains the list of positive peaks that are neither
   'shift peaks' nor water peaks. These peaks could arise from parts of
   the molecule not (yet) modeled. The peaks are clustered by distance
   and each cluster is given an individual residue name in this PDB
   formatted file for ease of use.

4. DDQsub_X (X = chain identifier)
   This file contains the number of atoms and the DDQ-P, DDQ-N, DDQ-W and
   DDQ-S (= DDQ-P + DDQ-N) scores for each main chain (M), side chain (S),
   and hetero/ligand moiety (H) as local quality indicators. The
   information per residue in this file can be plotted using GNUPLOT or
   EXCEL. A future version of this software will provide a postscript
   output of the DDQ scores per residue.

5. DDQoldWAT_ok.pdb
   PDB waters that agree with the DDQ waters

6. DDQoldWAT_okCS.pdb
   PDB waters that agree with DDQ waters via crystallographic symmetry

7. DDQnewWAT_okCS.pdb
   DDQ waters that are symmetry related to DDQoldWAT_okCS.pdb
   (use DDQnewWAT_okCS.pdb instead of DDQoldWAT_okCS.pdb in further
   rounds of refinement)

8. DDQnewWATmissing.pdb
   DDQ water peaks missing in the PDB file

9. DDQoldWAT_notok.pdb
   PDB waters that do not agree with the DDQ waters (which are
   > 3.0 sigma)
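
Illustration (not part of DDQ):
The water criteria listed under output file 2, and the WAT_CUTOFF_DIST
comparison behind output files 5, 8 and 9, can be sketched in a few lines
of Python. This is only a minimal sketch under stated assumptions, not
the FORTRAN77 code of DDQ itself. The function names, the data layout and
the rmaxp table are hypothetical; real RMAXp values live in atom_rad.dat
and are not reproduced here. Criterion iii is read as "any atom other
than the polar partners found in ii", and crystallographic symmetry
(files 6 and 7) is ignored.

```python
import math

# Distance windows in Angstrom for criterion ii, as quoted in the text.
POLAR_RANGE = {"O": (2.4, 3.4), "N": (2.5, 3.5)}
# Minimum separation from any other density peak (criterion i).
MIN_PEAK_SEPARATION = 2.4


def is_water_peak(peak, other_peaks, atoms, rmaxp):
    """Apply the three water criteria to one positive peak.

    peak        : (x, y, z) of the candidate peak
    other_peaks : list of (x, y, z) for all other density peaks
    atoms       : list of (element, (x, y, z)) for the model atoms
    rmaxp       : dict element -> RMAXp cutoff (hypothetical values; the
                  real ones are in atom_rad.dat)
    """
    # i) no other density peak within 2.4 Angstrom
    if any(math.dist(peak, q) < MIN_PEAK_SEPARATION for q in other_peaks):
        return False

    # ii) within the optimal distance range of at least one polar atom
    partners = set()
    for i, (element, xyz) in enumerate(atoms):
        if element in POLAR_RANGE:
            low, high = POLAR_RANGE[element]
            if low <= math.dist(peak, xyz) <= high:
                partners.add(i)
    if not partners:
        return False

    # iii) not within RMAXp of any other atom (assumed reading: every
    #      atom except the polar partners found in step ii)
    for i, (element, xyz) in enumerate(atoms):
        if i not in partners and math.dist(peak, xyz) < rmaxp[element]:
            return False
    return True


def match_waters(pdb_waters, ddq_waters, cutoff=1.0):
    """Compare PDB waters with DDQ waters using a distance cutoff.

    Returns (ok, not_ok, missing): PDB waters with a DDQ water within
    `cutoff`, PDB waters without one, and DDQ waters that no PDB water
    matched. Treating "within" as <= cutoff is an assumption.
    """
    ok, not_ok, matched = [], [], set()
    for water in pdb_waters:
        hits = [i for i, d in enumerate(ddq_waters)
                if math.dist(water, d) <= cutoff]
        if hits:
            ok.append(water)
            matched.update(hits)
        else:
            not_ok.append(water)
    missing = [d for i, d in enumerate(ddq_waters) if i not in matched]
    return ok, not_ok, missing
```

In this sketch the three lists returned by match_waters correspond
roughly to DDQoldWAT_ok.pdb, DDQoldWAT_notok.pdb and DDQnewWATmissing.pdb,
with the default cutoff of 1.0 Angstrom taken from WAT_CUTOFF_DIST.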