PDBCUR (CCP4: Supported Program)
NAME
pdbcur
- a curation tool providing various analyses and manipulations of PDB files
SYNOPSIS
pdbcur xyzin
foo_in.pdb
xyzout
foo_out.pdb
[Key-worded input file]
DESCRIPTION
pdbcur provides various functions for analysing and manipulating
the contents of PDB files. The program is written using the new MMDB
library for coordinate data, and thus works with a hierarchical view
of the atomic model. This hierarchy is visible, for example, in the
atom selection syntax.
INPUT AND OUTPUT FILES
XYZIN
Input coordinate file.
XYZOUT
Output coordinate file.
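The synopsis above can be driven from a script. The following is a minimal Python sketch, assuming pdbcur is on the PATH and reads its keyworded input from standard input, as the shell examples in this document do. The helper names build_pdbcur_call and run_pdbcur are hypothetical and not part of CCP4.

```python
import subprocess


def build_pdbcur_call(xyzin, xyzout, keywords):
    """Return (argv, stdin_text) for one pdbcur run.

    Hypothetical helper: it only assembles the command line from the
    synopsis and joins the keyword lines, one per line.
    """
    argv = ["pdbcur", "xyzin", xyzin, "xyzout", xyzout]
    stdin_text = "\n".join(keywords) + "\n"
    return argv, stdin_text


def run_pdbcur(xyzin, xyzout, keywords):
    """Run pdbcur, feeding the keyword lines on standard input.

    Assumes a CCP4 installation with pdbcur on PATH.
    """
    argv, stdin_text = build_pdbcur_call(xyzin, xyzout, keywords)
    return subprocess.run(argv, input=stdin_text, text=True,
                          capture_output=True, check=True)


# Build (but do not run) a call that deletes chain A and then keeps
# only C-alpha atoms; both keyword lines are taken from this document.
argv, script = build_pdbcur_call("foo_in.pdb", "foo_out.pdb",
                                 ["delchain A", "lvatom CA[C]"])
print(" ".join(argv))
print(script, end="")
```

Separating the construction of the call from its execution makes it possible to inspect the keyword script before pdbcur is invoked.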
KEYWORDED INPUT
Example: renchain /*/A 'B'
Quotation marks are optional; they are useful for designating 'no chain ID'.
Examples:
- rename A to 'no chain ID': renchain A ''
- rename 'no chain ID' to B: renchain /*// B
Example: renresidue (ALA) 'AL1'
Example: renatom CA[C] ' CC '
Example: renelement CA[C] 'AL'
Deletes the specified model(s).
Example (delete model #1): delmodel /1
Example (delete all models with chain A): delmodel /*/A
Deletes the specified chain(s).
Example (delete chain A in all models): delchain A
Example (delete chain A in 1st model): delchain /1/A
Deletes the specified residue(s).
Example (delete residues 33 to 120): delresidue 33-120
Deletes the specified atom(s).
Example (delete all C-gamma atoms): delatom CG[C]
Leaves the specified model(s), everything else is deleted.
Example (leave only model #1): lvmodel /1
Example (leave all models with chain A): lvmodel /*/A
Leaves the specified chain(s), everything else is deleted.
Example (leave chains A in all models): lvchain A
Example (leave only chain A in 1st model): lvchain /1/A
Leaves the specified residue(s), everything else is deleted.
Example (leave residues 33.A to 120.B): lvresidue 33.A-120.B
Leaves the specified atom(s), everything else is deleted.
Example (leave only C-alpha atoms): lvatom CA[C]
Writes 'xyzout' as a PDB, mmCIF or MMDB BINary
file. By default, the file is written in the format
of the input file.
No parameters; this keyword generates PDB 'TER' cards.
No parameters; this keyword generates correct atom
serial numbers.
No parameters; moves solvent chains to the end of models.
Input of the space group symmetry name, e.g. 'P 21 21 21'
(without quotation marks; spaces _are_ significant).
This parameter is mandatory if the coordinate file does not
specify the space group symmetry.
Input of the unit cell dimensions (space-separated
real numbers). This parameter is mandatory if the coordinate
file does not specify the cell parameters.
Generates a unit cell as defined by the crystallographic
information given in the coordinate file or set up with the
keywords 'symmetry' and 'geometry'. Chains generated
by the identity operation retain their names; all others
are renamed as c_n, where c is the chain's original
name and n is the number of the symmetry operation in
the space group used (numbering starts from 0 for the
identity operation). In order to comply with PDB standards,
the chains should then be renamed using the renchain
command, e.g. renchain A_2 H. Alternatively, the chains may be
assigned automatically generated 1-character names
using the command mkchainIDs.
Example: rnase.pdb contains 2 chains A and B.
Generate a unit cell, space group P 21 21 21, 4
symmetry operations, and assign chain IDs C,D,E for
chain A transformed by operations #1,2,3, and IDs
F,G,H for chain B transformed by the same operations.
Chains A and B transformed by 0th operation (identity)
retain their IDs:
pdbcur xyzin rnase.pdb xyzout ucell.pdb <<eof
symm P 21 21 21
genu
renc A_1 C
renc A_2 D
renc A_3 E
renc B_1 F
renc B_2 G
renc B_3 H
eof
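The c_n naming rule and the renaming step can be illustrated with a short Python sketch. It only reproduces the naming convention described above; it does not touch coordinates, and the function names genunit_chain_names and renchain_lines are hypothetical.

```python
def genunit_chain_names(chains, n_ops):
    """Names of the chain copies per original chain.

    Operation 0 (identity) keeps the name; operation n gives c_n.
    """
    return {c: [c if n == 0 else f"{c}_{n}" for n in range(n_ops)]
            for c in chains}


def renchain_lines(names, new_ids):
    """Build 'renchain old new' lines for every non-identity copy."""
    generated = [name for c in names for name in names[c][1:]]
    return [f"renchain {old} {new}" for old, new in zip(generated, new_ids)]


# Two chains and the 4 operations of P 21 21 21, as in the example above.
names = genunit_chain_names(["A", "B"], 4)
for line in renchain_lines(names, "CDEFGH"):
    print(line)
```

For chains A and B this prints the six renaming commands of the example, written with the full keyword renchain instead of the abbreviation renc.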
Declares (but does not apply) a symmetry operation.
The expressions for the X, Y and Z fractional
coordinates must be written without spaces.
Pairs 'old chain ID' - 'new chain ID' specify how
the chains should be renamed after the operation. This
input is optional. If no renaming is specified,
the newly generated chains are renamed automatically
(see keyword symcommit).
Example: symop Y+1/2,X-1/2,Z A S B R
(declares the symmetry transformation x=Y+1/2, y=X-1/2, z=Z
with renaming of chain A to S and B to R).
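To make the notation concrete, here is a minimal Python sketch that parses an operation written in this style and applies it to fractional coordinates. It is an illustration only, not the parser used by pdbcur: it assumes unit coefficients (X, -Y, ...) plus integer or rational shifts, and the names parse_symop and apply_symop are hypothetical.

```python
import re
from fractions import Fraction

# One term: optional sign, then either a number (optionally a fraction) or an axis.
TERM = re.compile(r"([+-]?)(?:(\d+)(?:/(\d+))?|([XYZ]))")


def parse_symop(text):
    """Parse e.g. 'Y+1/2,X-1/2,Z' into three (coefficients, shift) rows."""
    parts = text.upper().split(",")
    if len(parts) != 3:
        raise ValueError("expected three comma-separated components")
    rows = []
    for part in parts:
        coeff = {"X": 0, "Y": 0, "Z": 0}
        shift = Fraction(0)
        pos = 0
        while pos < len(part):
            m = TERM.match(part, pos)
            # Every term after the first must carry an explicit sign.
            if m is None or (pos > 0 and not m.group(1)):
                raise ValueError(f"cannot parse {part!r}")
            sign = -1 if m.group(1) == "-" else 1
            if m.group(4):
                coeff[m.group(4)] += sign
            else:
                shift += sign * Fraction(int(m.group(2)), int(m.group(3) or 1))
            pos = m.end()
        rows.append((coeff, shift))
    return rows


def apply_symop(rows, frac_xyz):
    """Apply parsed rows to one fractional coordinate triple."""
    old = dict(zip("XYZ", frac_xyz))
    return tuple(sum(c[k] * old[k] for k in "XYZ") + shift for c, shift in rows)


op = parse_symop("Y+1/2,X-1/2,Z")
point = (Fraction(1, 10), Fraction(1, 5), Fraction(3, 10))
print([str(v) for v in apply_symop(op, point)])  # ['7/10', '-2/5', '3/10']
```

Exact fractions are used so that shifts such as 1/2 or 1/3 are not rounded; floating-point coordinates can be passed in as well.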
No parameters.
Applies all symmetry operations declared since the
last symcommit statement. The first operation (normally
identity) is applied to the existing set of
coordinates; all others are applied to
duplicates of the coordinates, and the results
are merged.
The newly generated chains are named as C_n,
where C is the original chain name, and n is the
symmetry operation number. Symmetry operations
are numbered as they appear in symop statements,
from 0 on; however the very first one is applied
to the existing chains, which are not renamed in
this case.
Example:
pdbcur xyzin rnase.pdb xyzout rnase1.pdb <<eof
symop X,Y,Z
symop Y+1/2,X-1/2,Z
symcommit
eof
just adds two chains named A_1 and B_1, obtained
according to the rule Y+1/2,X-1/2,Z from chains
A and B, to the existing file.
Automatically generates 1-character chain IDs after
applying symmetry operations. The IDs are generated
such that they use all available letters starting
from A, and a chain is not renamed if its name is
already a 1-character one.
The following example
pdbcur xyzin rnase.pdb xyzout ucell.pdb <<eof
symm P 21 21 21
genu
mkch
eof
produces exactly the same result as that given for
keyword GENUNIT, because the original chains are named
sequentially as A,B (not G,I, for example).
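The ID assignment described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the program's actual algorithm: it takes the chains in the order given, treats only upper-case letters as available, and the function name make_chain_ids is hypothetical.

```python
import string


def make_chain_ids(chain_names):
    """Map every chain name to a 1-character ID.

    Names that are already at most one character long are kept; the
    others receive the next unused letter, starting from A.
    """
    used = {name for name in chain_names if len(name) <= 1}
    free = (c for c in string.ascii_uppercase if c not in used)
    mapping = {}
    for name in chain_names:
        if len(name) <= 1:
            mapping[name] = name
        else:
            new_id = next(free, None)
            if new_id is None:
                raise ValueError("ran out of 1-character chain IDs")
            mapping[name] = new_id
    return mapping


# Assumed chain order: each original chain followed by its copies.
chains = ["A", "A_1", "A_2", "A_3", "B", "B_1", "B_2", "B_3"]
for old, new in make_chain_ids(chains).items():
    print(old, "->", new)
```

With the chain order assumed here, A_1, A_2 and A_3 become C, D and E, and B_1, B_2 and B_3 become F, G and H, which matches the manual renaming shown for GENUNIT. A different chain order in the file would give a different assignment.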
Euler rotation of selected atoms through angles alpha,
beta and gamma (degrees), applied about the initial
Z-axis, the new Y-axis and the newest Z-axis, respectively.
The rotation center is given by either orthogonal
coordinates x, y and z or by keyword 'center' for
specifying the mass center of the selected atoms.
Examples:
1. 90-degree rotation of chain A about Z-axis in
original coordinate system:
rotate A 90 0 0 0 0 0
2. 60-degree rotation of chains A and B about Y-axis
in the coordinate system of their mass center:
rotate 'A,B' 0 60 0 center
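The rotation can be written out as a short Python sketch. It assumes a right-handed, active rotation convention, in which the Z, new Y, newest Z sequence corresponds to the matrix product Rz(alpha) Ry(beta) Rz(gamma); the sign convention actually used by the program is not stated here. The unweighted centroid stands in for the mass center, and all function names are hypothetical.

```python
import math


def rot_z(deg):
    """Rotation matrix about the Z-axis (angle in degrees)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(deg):
    """Rotation matrix about the Y-axis (angle in degrees)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_zyz(alpha, beta, gamma):
    """Z, new Y, newest Z rotation as a single matrix (assumed convention)."""
    return matmul(matmul(rot_z(alpha), rot_y(beta)), rot_z(gamma))


def centroid(points):
    """Unweighted mean position; a stand-in for the mass center."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def rotate_point(p, matrix, center=(0.0, 0.0, 0.0)):
    """Rotate point p about the given center."""
    d = [p[i] - center[i] for i in range(3)]
    return tuple(center[i] + sum(matrix[i][j] * d[j] for j in range(3))
                 for i in range(3))


# Analogue of 'rotate A 90 0 0 0 0 0' applied to a single point.
print(rotate_point((1.0, 0.0, 0.0), euler_zyz(90, 0, 0)))
```

Under these assumptions the point (1, 0, 0) moves to (0, 1, 0) up to rounding error. Passing centroid(points) as the center mimics the 'center' keyword.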
Rotation of selected atoms through angle alpha (degrees)
about a vector given by direction (vx,vy,vz) from the
rotation center (given as x,y,z or by keyword 'center'
for the mass center of the selected atoms). The vector
may also be specified by two atoms atom1 and atom2
represented in the mmdb selection notation.
Examples:
1. 90-degree rotation of chain A about Z-axis in
original coordinate system:
vrotate A 90 0 0 1 0 0 0
2. 60-degree rotation of chains A and B about Y-axis
in the coordinate system of their mass center:
vrotate 'A,B' 60 0 1 0 center
3. 45-degree rotation of all atoms about the vector connecting
the C-alpha atoms of residue 20.A of chain A and residue 55
of chain B:
vrotate /*/*/*/* 45 /1/A/20.A/CA[C] /1/B/55/CA[C]
or, if there is only one model in the PDB file:
vrotate * 45 A/20.A/CA[C] B/55/CA[C]
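A rotation about an arbitrary vector can be sketched with Rodrigues' formula. The Python code below is an illustration under stated assumptions: a right-handed rotation sense, and, for the two-atom form, an axis that passes through the first atom. Neither is confirmed by this document, and the function names are hypothetical.

```python
import math


def rotate_about_axis(p, angle_deg, axis, center=(0.0, 0.0, 0.0)):
    """Rotate point p by angle_deg about 'axis' passing through 'center'.

    Uses Rodrigues' rotation formula with the axis normalised first.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("rotation axis has zero length")
    k = [a / norm for a in axis]
    v = [p[i] - center[i] for i in range(3)]
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(k[i] * v[i] for i in range(3))
    return tuple(center[i] + v[i] * c + k_cross_v[i] * s
                 + k[i] * k_dot_v * (1.0 - c) for i in range(3))


def axis_from_atoms(atom1, atom2):
    """Direction from atom1 to atom2 (positions as x, y, z tuples)."""
    return tuple(atom2[i] - atom1[i] for i in range(3))


# Analogue of 'vrotate A 90 0 0 1 0 0 0' applied to a single point.
print(rotate_about_axis((1.0, 0.0, 0.0), 90, (0.0, 0.0, 1.0)))

# Two-atom form: axis through atom1, pointing towards atom2 (assumption).
a1, a2 = (1.0, 1.0, 0.0), (1.0, 1.0, 5.0)
print(rotate_about_axis((2.0, 1.0, 0.0), 90, axis_from_atoms(a1, a2), a1))
```

Because the axis is normalised inside the function, the length of the direction vector (vx,vy,vz) does not affect the result.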
Specification of the selection sets:
- either
- /mdl/chn/s1.i1-s2.i2/at[el]:aloc
- or
- /mdl/chn/*(res).ic/at[el]:aloc
where no spaces are allowed. The slashes separate the
hierarchical levels of models, chains, residues and atoms.
Notations:
mdl - the model's serial number or 0 or '*' for any model
(default).
chn - the chain ID or list of chain IDs like 'A,B,C' or
'*' for any chain (default).
s1,s2 - the starting and ending residue sequence numbers
or '*' for any sequence number (default).
i1,i2 - the residues' insertion codes or '*' for any
insertion code. If a sequence number other than
'*' is specified, then the insertion code defaults to ""
(no insertion code), otherwise the default is '*'.
res - residue name or list of residue names like 'ALA,SER'
or '*' for any residue name (default)
at - atom name or list of atom names like 'CA,N1,O' or
'*' for any atom name (default)
el - chemical element name or list of chemical element
names like 'C,N,O', or '*' for any chemical element
name (default)
aloc - the alternative location indicator or list of
alternative locations like 'A,B,C', or '*' for any
alternate location. If the atom name and chemical
element name are specified (both may be '*'), then
the alternative location indicator defaults to ""
(no alternate location), otherwise the default is
'*'.
Values for chain IDs, residue names, atom names, chemical element
names and alternative location indicators may be negated by
prefix '!'. For example, '!A,B,C' for the list of chain names
means 'any chain ID but A,B,C'.
Generally, any hierarchical element, as well as the selection
code, may be omitted, in which case it is replaced by its
default (see above). This makes the following examples valid:
* select all atoms
/1 select all atoms in model 1
A,B select all atoms in chains A and B in
all models
/1// select all atoms in chain without chainID
in model 1
/*/,A,B/ select all atoms in chain without chainID,
chain A and B in all models
33-120 select all atoms in residues 33. to 120.
in all chains and models
A/33.A-120.B select all atoms in residues 33.A to
120.B in chain A only, in all models
A/33.-120.A/[C] select all carbons in residues 33. to
120.A in chain A, in all models
CA[C] select all C-alphas in all
models/chains/residues
A//[C] select all carbons in chain A, in all models
(!ALA,SER) select all atoms in any residues but
ALA and SER, in all models/chains
/1/A/(GLU)/CA[C] select all C-alphas in GLU residues of
chain A, model 1
/1/A/*(GLU)./CA[C]: same as above
[C]:,A select all carbons without alternative
location indicator and carbons in alternate
location A
NOTE: if a selection contains comma(s), the selection must
be enclosed in quotation marks, which indicate to the input parser that
the selection is a single input parameter rather than a set of
comma-separated arguments.
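The hierarchical layout of a selection can be illustrated with a small Python sketch that splits the fully qualified form into its fields. It is deliberately limited: it accepts only the four-level form with a leading slash, does not apply the default rules, does not interpret lists or the '!' negation, and ignores negative sequence numbers. The function name parse_full_selection is hypothetical and this is not the MMDB parser.

```python
import re

# at[el]:aloc, where [el] and :aloc are optional
ATOM = re.compile(r"^(?P<at>[^\[\]:]*)(?:\[(?P<el>[^\]]*)\])?(?::(?P<aloc>.*))?$")
# s1.i1-s2.i2, where everything after s1 is optional
RANGE = re.compile(r"^(?P<s1>\*|\d+)(?:\.(?P<i1>[^-]*))?"
                   r"(?:-(?P<s2>\*|\d+)(?:\.(?P<i2>.*))?)?$")
# *(res).ic, where the sequence number and .ic are optional
NAMED = re.compile(r"^(?P<seq>\*|\d+)?\((?P<res>[^)]*)\)(?:\.(?P<ic>.*))?$")


def parse_full_selection(text):
    """Split '/mdl/chn/residue/atom' into raw fields (None means absent)."""
    if not text.startswith("/"):
        raise ValueError("sketch handles only the full /mdl/chn/res/atom form")
    fields = text[1:].split("/")
    if len(fields) != 4:
        raise ValueError("expected exactly four hierarchical levels")
    mdl, chn, res, atom = fields
    res_match = NAMED.match(res) or RANGE.match(res)
    if res and res_match is None:
        raise ValueError(f"cannot parse residue field {res!r}")
    atom_match = ATOM.match(atom)
    if atom_match is None:
        raise ValueError(f"cannot parse atom field {atom!r}")
    return {
        "model": mdl,
        "chain": chn,
        "residue": res_match.groupdict() if res_match else {},
        "atom": atom_match.groupdict(),
    }


print(parse_full_selection("/1/A/33.A-120.B/CA[C]:A"))
print(parse_full_selection("/1/A/*(GLU)./CA[C]:"))
```

Returning the raw fields, with None for parts that were not given, leaves the default rules listed above to the caller.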
PROGRAM OUTPUT
The program currently gives a short summary of the operations carried
out.
EXAMPLES
Runnable example
pdbcur.exam
SEE ALSO
ncont - MMDB application for finding contacts.
pdbset - traditional PDB utility program.
AUTHORS
Eugene Krissinel, European Bioinformatics Institute, Cambridge, UK.