Selection/deselection commands define the ON/OFF status of the atom records, thereby preparing objects for other EDPDB commands or other programs. For example, EDPDB can be used to select atoms for graphics utilities that read a PDB file as input but offer less powerful atom selection. EDPDB selects records by matching one of the PDB fields (except the entry number) against a selection criterion. Three-dimensional intra- or intermolecular distances, including distances between symmetry-related molecules, can also be used as selection criteria.
Selection strategies with different logic, e.g. logic and, logic or and logic not, can be constructed with selection commands:
logic or -- parallel selection statements, i.e. a series of selection commands issued consecutively.
logic and -- see piping { ... | ... }.
logic not -- the EXCLUDE and SWAP commands, and the EXCEPT option of the ATOM and RESIDUE commands.
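The three strategies can be pictured as operations on the set of ON records. The following Python sketch is only an illustration: the record dictionaries, the select helper and the variable names are hypothetical and not part of EDPDB.

```python
# Hypothetical model: each record is a dict, the ON set holds record indices.
records = [
    {"atom": "N",  "chain": "A", "b": 8.0},
    {"atom": "CA", "chain": "A", "b": 12.0},
    {"atom": "CA", "chain": "B", "b": 9.0},
    {"atom": "CB", "chain": "B", "b": 25.0},
]

def select(pred, pool=None):
    """Indices of records in pool (default: all records) that satisfy pred."""
    pool = range(len(records)) if pool is None else pool
    return {i for i in pool if pred(records[i])}

# logic or: consecutive selection commands accumulate ON records.
on = select(lambda r: r["chain"] == "A")
on |= select(lambda r: r["b"] < 10.0)
logic_or = set(on)

# logic and: piping applies the second command only to the output of the first.
logic_and = select(lambda r: r["b"] < 10.0,
                   pool=select(lambda r: r["chain"] == "A"))

# logic not: SWAP-like inversion of the ON set.
logic_not = set(range(len(records))) - logic_and
```

In this model, piping narrows a selection while consecutive commands widen it, which is why the order of commands inside { ... | ... } does not change the result of a logic and.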
Function: Selection
Syntax:
load grp_id2 | ALIGN3D grp_id1 sub_grp_id1
[distance_cutoff [penalty_to_break]]
Note:
1) This routine creates two sets of atoms that are close to each
other in 3D from two groups of atoms. One group is specified by
grp_id1; the other (grp_id2) is selected previously and piped to
this command. The matched atoms from grp_id1 are stored in the
group sub_grp_id1, and those from the piped-in group become ON
atoms.
2) The matching balances maximizing the number of matched atom pairs against minimizing the overall coordinate difference. Needleman's alignment algorithm is used. The score for a given atom pair (i,j) is
s(i,j) = max(0.0, distance_cutoff**2 - distance(i,j)**2)
The default value of distance_cutoff is 2.0. The default value of penalty_to_break is -distance_cutoff**2.
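To make the scoring concrete, here is a minimal Python sketch of a Needleman-style global alignment driven by the pair score above. It is not EDPDB's implementation. In particular, it assumes that penalty_to_break is charged once for every atom left unmatched, which may differ from how EDPDB applies it; the function names pair_score and align3d are hypothetical.

```python
def pair_score(p, q, distance_cutoff=2.0):
    """s(i,j) = max(0, distance_cutoff**2 - distance(i,j)**2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return max(0.0, distance_cutoff ** 2 - d2)

def align3d(group1, group2, distance_cutoff=2.0, penalty=None):
    """Return index pairs (i, j) of matched atoms, in sequence order."""
    if penalty is None:
        penalty = -distance_cutoff ** 2  # default stated in the note
    n, m = len(group1), len(group2)
    # best[i][j]: best total score aligning the first i and first j atoms.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * penalty  # assumption: linear penalty per skipped atom
    for j in range(1, m + 1):
        best[0][j] = j * penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = pair_score(group1[i - 1], group2[j - 1], distance_cutoff)
            best[i][j] = max(best[i - 1][j - 1] + s,
                             best[i - 1][j] + penalty,
                             best[i][j - 1] + penalty)
    # Trace back, keeping only pairs that score above zero.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        s = pair_score(group1[i - 1], group2[j - 1], distance_cutoff)
        if best[i][j] == best[i - 1][j - 1] + s:
            if s > 0.0:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + penalty:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]

# Three CA-like positions versus four, where the second atom of
# group2 is an insertion far away from everything in group1.
g1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
g2 = [(0.1, 0.0, 0.0), (20.0, 0.0, 0.0), (3.9, 0.0, 0.0), (7.5, 0.0, 0.0)]
matched = align3d(g1, g2)
```

A pair farther apart than distance_cutoff scores zero, so it never contributes to the total and is dropped from the matched sets during traceback.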
See also: EXTRACT, MATCH and MATCH3D.
Examples:
1) The following example calculates the least-squares rms
difference of alpha-carbon atoms between two homologous
structures, A and B, which have been roughly overlaid on each
other but may contain insertions or deletions in their sequences.
initialize
{ca | chain a \ group a}
{ca | chain b | align3d a suba 3.0}
overlay suba rtn.dat
chain b
rtn file rtn.dat ! apply the overlay matrix to molecule B
This process can be repeated to improve the match.
Function: Selection, Information
Syntax:
1) ATOM
2) ATOM atom_1 [atom_2 ... ]
3) ATOM EXCEPT atom_1 [atom_2 ... ]
Note:
1) The first form lists the atom names in the current input PDB
file. The second form selects up to sixteen types of atoms. The
third form selects every atom except the ones specified.
2) The specified atom names, atom_n, should be one of the
input atom names.
3) ATOM is the only selection command where a wildcard (* or %) is
accepted. '*' can only be used at the end of an atom_n and stands
for any character string.
'%' is used to replace one character in an atom_n.
RESIDUE and ATOM are the only two commands
where one can use the EXCEPT option.
4) Because of the wildcard (* or %) as well as the use of (') for default
delimiter in the command interpreter, atom names with these
characters included may not be handled properly unless the
wildcard or the delimiter is reset to some other character using
SETENV command.
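The wildcard rules in note 3 can be sketched in a few lines of Python. This is an illustration only: the function name atom_matches is hypothetical, and the case-insensitive comparison is an assumption based on the examples, which type atom names in lower case.

```python
def atom_matches(pattern, name, multi="*", single="%"):
    """True if name matches pattern; multi is honoured only as the last character."""
    pattern, name = pattern.upper(), name.upper()  # assumption: case-insensitive
    if pattern.endswith(multi):
        head = pattern[:-1]
        if len(name) < len(head):
            return False
        name = name[:len(head)]  # '*' absorbs the rest of the name
    else:
        head = pattern
        if len(name) != len(head):
            return False
    # '%' stands for exactly one character; anything else must match literally.
    return all(p == single or p == c for p, c in zip(head, name))

names = ["N", "CA", "CB", "SG", "OG1", "CG1", "NE1", "NE2"]
gamma = [n for n in names if atom_matches("%g*", n)]
not_ne = [n for n in names if not atom_matches("ne*", n)]
```

The multi and single parameters mirror the fact that both wildcard characters can be reset with SETENV.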
Examples:
1) List out all the atom names. The number appearing in each pair
of brackets [ ] is the number of ON atoms for the corresponding
atom name.
atom
2) Select every N, CA, C and CB atom.
atom n ca c cb
3) Select all atoms from a group of records (TMP), except those atoms whose names start with the characters NE (e.g. NE1, NE2).
{ load tmp | atom except ne*}
4) Select all gamma atoms, e.g. SG, OG1, CG1 etc.
atom %g*
Function: Selection
Syntax:
1) B (<, >, <>) cutoff(s) [ANGLE]
2) B MAX
Note: If the ANGLE option is selected, the values are considered as angles.
Examples:
1) Select atoms of 10.0 <= B < 20.0
b < 20.
exclude b < 10.
or
b <> 9.999 20.
or
{ b > 9.999 | b < 20. }
Function: Selection
Syntax:
CA
Examples:
1) Select Ca atoms in zone 1 - 50
{ ca | zone 1 - 50}
or
ca 1 - 50 ; this syntax works for the CA, MAIN and SIDE
          ; as well as the ZONE commands
          ; Do not use a syntax like
          ; { ca 1 - 50 | ... }
2) Select every amino acid
ca ; select ca first
more ; extend the selection to every residue
3) Select every O5 atom
dfca o5
ca
Function: Selection, Information
Syntax:
1) CHAIN
2) CHAIN [chn_mark_1 chn_mark_2 ...]
Note:
1) The first form lists the available chains. The numbers listed in
the parentheses are the number of records selected from the
corresponding chain.
2) A chain name is read in from the input PDB file. It can not be
redefined, although the corresponding text string can be edited
using SETC command.
3) A chain name in the residue ID can not be changed with the
SETC command. Therefore the chain name in the residue ID
may be different from that in the displayed text string.
See also: GROUP
Examples:
1) List all the chain names and the number of atoms in each chain
zone all
chain
2) Select chains A and C
initial
chain a c
3) Select atoms of B < 10.0 from chain A only
{ chain a | b < 10.0 }
or
{ b < 10.0 | chain a }
Function: Selection, Definition
Syntax:
1) EXCLUDE
2) EXCLUDE another_selection_command
Note:
1) The first form sets the selection switch to OFF.
2) In the second form, the OFF status affects only the selection
command which follows. The switch is then set back to
whatever it was before the current command.
See also: INCLUDE, INITIAL, RESET and SHDF
Examples:
1) Turn off the CB atoms.
exclude atom CB
2) Analyze the x, y, z, w, b of the main chain atoms, the side chain atoms and the whole protein.
initial
main
analyze ; analyze the main chain
more
analyze ; analyze the whole protein
exclude main
analyze ; analyze the side chain
3) Select the side chain atoms of all uncharged residues
zone all
exclude ; set the switch to OFF
residue asp glu ; turn Asp Glu off
residue arg lys ; turn Arg Lys off
main ; turn the backbone atoms off
include ; set the switch back to ON
Function: Selection
Syntax:
load grp_id2 | EXTRACT grp_id1 sub_grp_id1
Note: grp_id1 and grp_id2 are the IDs of the two groups, which should have the same number of atoms. The sub_grp_id1 contains a subset of the atoms of grp_id1. The selected atoms will be the subset of grp_id2 that corresponds, atom by atom, to sub_grp_id1 within grp_id1.
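The correspondence can be pictured as a positional lookup. The Python sketch below is only an illustration, assuming records are hashable tuples and that the k-th atom of grp_id1 corresponds to the k-th atom of grp_id2; the function name extract and the sample records are hypothetical.

```python
def extract(grp1, grp2, sub_grp1):
    """Return the records of grp2 whose positions match those of sub_grp1 in grp1."""
    if len(grp1) != len(grp2):
        raise ValueError("grp_id1 and grp_id2 must have the same number of atoms")
    wanted = set(sub_grp1)
    # Walk both groups in parallel and keep the grp2 partner of each wanted atom.
    return [b for a, b in zip(grp1, grp2) if a in wanted]

moda = [("A", 10, "N"), ("A", 10, "CA"), ("A", 10, "CB")]
modb = [("B", 10, "N"), ("B", 10, "CA"), ("B", 10, "CB")]
spha = [("A", 10, "CA"), ("A", 10, "CB")]
selected = extract(moda, modb, spha)
```

Because the lookup is purely positional, the two groups must list their atoms in the same order as well as have the same length.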
Examples:
1) In the following example, we are going to superimpose two sets
of atoms from two models of the same protein molecule. One set
of atoms in model A (chain A) are within a sphere of radius 8.0 Å
and centered at CB atom of residue A10. The other set of atoms is
the corresponding atoms in model B (chain B), but they may not
fall in a sphere of radius 8.0 Å and centered at CB atom of residue
B10.
{ chain a \ group moda} ; define group moda
{ chain b \ group modb} ; define group modb
{ nayb 8.0 A10 cb | load moda \ group spha } ; select atoms from chain a within the sphere
initialize
{ load modb | extract moda spha } ; select corresponding atoms from chain b
overlay spha ; overlay b to a
If chain A and chain B are not identical (e.g. they may have a different number of atoms because of a mutation), the following procedure may be required.
{ chain a \ group ch_a}
{ chain b \ group ch_b}
{ match ch_a moda | load ch_b \ group modb} ; make sure the B model (modb) has
    ; the same number of atoms as A model (moda).
{ nayb 8.0 A10 cb | load moda \ group spha } ; select atoms from a model (moda)
initialize
{ load modb | extract moda spha} ; select corresponding atoms from b model (modb)
overlay spha ; overlay b to a
Function: Definition, Information
Syntax:
1) GROUP
2) GROUP group_id
Note:
1) The first form will list the current non-empty group names.
2) The group_id is a character string of up to 4 characters. The
group specified with the group_id will be created/overwritten to
store the selected records.
3) The selection made within the subcommand will not affect the
current ON atoms (the main buffer). The default selection is
the ON atoms.
4) The group_id SCR is reserved for the program.
5) If the number of groups exceeds the program limit an error
message will appear: some groups must be deleted before a
new group can be defined.
See also: LOAD, SWAP and maximum number of groups.
Examples:
1) List all the currently defined group names
group
2) Define the current ON atoms as group A
group a
3) Define CA atoms as group g_ca without changing the current ON atom list.
{ca | group g_ca }
4) Delete the group g_ca by defining it as an empty group.
initial
group g_ca
or
{ group g_ca }
Function: Selection, Definition
Syntax:
1) INCLUDE
2) INCLUDE another_selection_command
Note:
1) The first form sets the selection switch to ON.
2) In the second form, the ON status affects only the selection
command which follows. The switch is then set back to whatever
it was before the current command.
See also: EXCLUDE, INITIAL, RESET and SHDF
Examples:
1) Set the selection switch to ON
include
2) Turn the backbone atoms to ON
include main
Function: Selection, Definition
Syntax:
INITIAL
Note: INITIAL changes only the ON/OFF status (i.e. pointer, or flag) of the records. It does not undo any modification made to the records (e.g. a coordinate transformation).
See also: RESET
Examples:
1) Turn all records OFF.
initial
2) Analyze the x, y, z, W and B of the backbone and side chain atoms of zone 1 - 100.
{ main | zone 1 - 100 }
analyze
; next to analyze the side chain of zone 1 - 100
initial ; the backbone atoms need to be turned OFF
        ; before selecting the side chain atoms.
{ side | zone 1 - 100 }
analyze
Function: Selection
Syntax:
LOAD group_id
Note:
A group named as SCR can be defined with some
calculation commands such as DISTANCE.
Examples:
1) Calculate the backbone coordinate rms deviation between
molecules A, B and C, assuming they have the same number of
atoms.
{ main | chain a \ group a}
{ main | chain b \ group b}
{ main | chain c \ group c}
initial
load a ; turn the main chain atoms of molecule A ON
overlay b ; get rms between A and B
overlay c ; get rms between A and C
initial
load b ; turn the main chain atoms of molecule B ON
overlay c ; get rms between B and C
2) Select atoms in molecule B which contact molecule A.
{ chain a \ group mola}
{ chain b \ group molb}
initial
load mola
distance molb 0.0 3.5 1 2000 LOAD ; the LOAD option of DISTANCE is turned on.
    ; it creates a new group SCR to store the contacted atoms;
    ; also the occupancy field of each ON atom will be
    ; changed to the number of its neighbor atoms.
initial
load SCR ; select the atoms
Function: Selection
Syntax:
MAIN
Examples:
1) Select N CA CB C atoms
dfmain N Ca Cb C
main
2) Select main chain atoms in zone 1 - 50
{ main | zone 1 - 50 }
or
main 1 - 50 ; this syntax works for the CA, MAIN and SIDE
            ; as well as the ZONE commands
Function: Selection
Syntax:
load grp_id2 | MATCH grp_id1 sub_grp_id1
Note:
1) This routine creates two sets of common atoms shared
between two lists of records (common in terms of their atom
names and residue names). One list is specified by grp_id1,
the other by grp_id2. The matched atoms from grp_id1 are
stored in sub_grp_id1, and those from grp_id2 become ON atoms.
2) The matching is based on the order of the atoms in each list.
The atoms of each list are compared on a residue-by-residue
basis, and those that match are stored. Then the next residue
of each list is compared, and so on until one of the lists is
exhausted. The residue numbers are NOT compared; residue
correspondence is determined solely by the ordering in each
list.
See also: ALIGN3D, EXTRACT and GROUP
Examples:
1) The following example calculates the least-squares rms
difference between segments 1 - 162 of model A and of model B,
where model A is the wild type enzyme and model B is a mutant,
e.g. M6I, in which the methionine at position 6 has been
substituted with an isoleucine. In order to perform the OVERLAY
calculation, one needs to select two sets of records that
match one to one.
It is assumed that the atom order in each residue is the same in
both models, except for residue 6. If this is not the case, the
file needs to be standardized (see the command SORT DFRES).
{ zone a1 - a162 \ group wt } ; define model A as group wt
{ zone b1 - b162 \ group m6i} ; define model B as group m6i
initial
{ load m6i | match wt wt_m } ; this command should be interpreted as
    ; select records from m6i which match with wt;
    ; the records in group wt_m are
    ; the matched atoms from model A (i.e. the wild type);
    ; the selected records (i.e. the ON atoms) are
    ; the matched atoms from model B (i.e. m6i).
overlay wt_m ; overlay the model B to model A
2) Another way to program the above example.
{ zone a1 - a162 \ group a \ init}
{ zone b1 - b162 | match a wt \ group m6i} ; this command should be interpreted as
    ; match records from group a with those of "zone b1-b162"
    ; the records in group wt are
    ; the matched atoms from model A (i.e. the wild type);
    ; the records in group m6i are
    ; the matched atoms from model B (i.e. m6i).
initial
load m6i
overlay wt
Function: Calculation, Selection
Syntax:
1) MMI radius [res_id [atom_name]]
2) MMI radius CENTER x, y, z
Note:
1) In the first format, the atom specified with res_id and
atom_name serves as the search center. The default atom of
the specified residue is the first atom in that residue. If there is
no res_id specified, the first ON atom will serve as the search
center.
2) The second form uses user specified x, y, z as the search
center.
3) The W column of the selected records will be changed to the
distance between the specified center atom and its neighboring
atoms.
4) The symmetry operator listed in the output should be applied to
the specified atom to achieve the contact. No unitary symmetry
operator is included in this calculation.
See also: MMIG, MMIR, MOVECENTER, NAYB and NAYBR
Examples:
1) Select atoms that are in 4.0 Å crystal contact with Nd2 atom of
residue 116.
cell 61.2 61.2 96.8 90.0 90.0 120.0 6 ; input the cell parameters
@symmetry P3221 ; input symmetry operators
mmi 4.0 116 Nd2
Function: Selection
Syntax:
MMIR radius res_id
Note:
1) Atoms in the symmetry related molecule that are within the
radius of any atom in the specified residue will be selected.
2) Cell parameters and symmetry operator(s) are required for this
command.
See also: MMI, MMIG, MOVECENTER and NAYBR
Examples:
1) Select atoms that are in 4.0 Å crystal contact with residue 116.
cell 61.2 61.2 96.8 90 90 120 6 ; input the cell parameters
@symmetry p3221 ; input symmetry operators
mmir 4. 116
Function: Selection
Syntax:
1) MORE [i0 [i1]]
2) MORE CHAIN
Note:
1) If any atom in the i-th residue is currently ON, the
MORE command will turn ON every atom in the zone from
position (i+i0) to position (i+i1). The default i0 is 0,
and the default i1 is i0.
2) MORE works in a positive way only. For example, if an atom
in a residue is currently ON, MORE command will turn every
atom in that residue ON, regardless of the status of the
INCLUDE/EXCLUDE switch.
3) The CHAIN option will expand ON status from a single atom
to the entire molecule.
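The expansion rule of the first form can be sketched as follows. This Python fragment is an illustration under stated assumptions: atoms are identified by index, residue_of gives each atom's sequential residue position, and chain boundaries are ignored. The function name more is hypothetical.

```python
def more(residue_of, on, i0=0, i1=None):
    """residue_of[k] is the sequential residue position of atom k;
    on is the set of atom indices that are currently ON."""
    if i1 is None:
        i1 = i0  # the default i1 is i0
    seeds = {residue_of[k] for k in on}
    targets = {r + d for r in seeds for d in range(i0, i1 + 1)}
    # MORE only adds atoms, so the previous ON atoms are always kept.
    return set(on) | {k for k, r in enumerate(residue_of) if r in targets}

# Four residues with two atoms each; one atom of the second residue is ON.
residue_of = [0, 0, 1, 1, 2, 2, 3, 3]
whole_residue = more(residue_of, {2})
tripeptide = more(residue_of, {2}, -1, 1)
```

With the defaults, the call expands each partly selected residue to the whole residue; with -1 and 1 it also takes in the preceding and following residues.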
Examples:
1) Select the protein molecule only, ie. all residues that contain Ca
atoms.
initial
ca
more
or
{ ca \ more}
2) Select all residues that contain atoms within 4.0 Å from residue 99.
initial
naybr 4.0 99 ; select atoms
more ; expand to residues
or
{ naybr 4.0 99 \ more}
3) Select all tripeptides that have a Gly as the middle residue.
initial
residue Gly
more -1 1
Function: Calculation, Selection
Syntax:
1) NAYB radius [res_id [atom_name]]
2) NAYB radius CENTER x, y, z
Note:
1) In the first format, the atom specified with res_id and
atom_name serves as the search center. The default atom of
the specified residue is the first atom in that residue. If there is
no res_id specified, the first ON atom will serve as the search
center.
2) No crystallographic symmetry information is required or used.
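The second form amounts to a plain distance test against a fixed point. A minimal Python sketch, assuming coordinates are (x, y, z) tuples and that atoms exactly on the sphere surface count as selected; the function name nayb is hypothetical:

```python
import math

def nayb(coords, radius, center):
    """Indices of atoms whose distance to center is at most radius."""
    return [i for i, xyz in enumerate(coords) if math.dist(xyz, center) <= radius]

coords = [(10.0, 20.0, 30.0), (13.0, 24.0, 30.0), (20.0, 20.0, 30.0)]
near = nayb(coords, 6.0, (10.0, 20.0, 30.0))
```

The first form works the same way, with the coordinates of the specified atom used as the center.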
See also: AB, DISTANCE, MMI and NAYBR
Examples:
1) Select all polar atoms (ie. oxygen and nitrogen atoms) within
8.0 Å sphere from Oe1 of Glu 11.
{ nayb 8.0 11 Oe1 | atom O* N* }
2) Select all atoms that are close (e.g. 6.0 Å) to the point (10.0, 20.0, 30.0).
nayb 6.0 center 10.0 20.0 30.0
Function: Calculation, Selection
Syntax:
NAYBR radius res_id
Note:
No crystallographic symmetry information is required or used.
See also: AB, DISTANCE, MMIR and NAYB
Examples:
1) Select atoms within 4.0 Å shell from residue 11
naybr 4.0 11
2) Select residues that have atoms within 4.0 Å shell from residue 11.
initial
naybr 4.0 11
more
or
{ naybr 4.0 11 \ more}
Function: Selection
Syntax:
load grp_id | PATTERN pattern_string [position]
Note:
1) The pattern_string is the text string to be searched for.
Each letter of the pattern_string is matched against one record,
so a pattern of n letters spans n successive records.
For example, if the single-letter amino acid code is embedded
in the text string of the Ca atom records, this command can
locate the a.a. sequence 'where' in the protein and select the
five records of the corresponding Ca atoms. '%' is used as a
one-letter wildcard, which can be changed with the SETENV
command.
2) The position is an integer. It tells where in each record to
look for the letter of the pattern. The default position is the
chain mark position (15).
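The one-letter-per-record search can be sketched in Python as below. This is an illustration only: it assumes position is a 1-based column, that the comparison is case-insensitive, and that the records are given as a list of text strings. The function name find_pattern is hypothetical.

```python
def find_pattern(texts, pattern, position=15, wildcard="%"):
    """Return start indices of runs of records whose character at `position`
    (1-based column) spells out `pattern`, one letter per record."""
    col = position - 1
    letters = [t[col] if len(t) > col else " " for t in texts]
    hits = []
    for start in range(len(letters) - len(pattern) + 1):
        window = letters[start:start + len(pattern)]
        if all(p == wildcard or p.lower() == c.lower()
               for p, c in zip(pattern, window)):
            hits.append(start)
    return hits

# Seven Ca records whose column 15 carries the one-letter code A W H E R E K.
texts = [" " * 14 + code for code in "AWHEREK"]
exact = find_pattern(texts, "where")
with_wildcard = find_pattern(texts, "w%ere")
```

A pattern longer than the list of records simply finds nothing.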
Examples:
1) Find the a.a. sequence 'herewego'.
@aaa2a ! embed single a.a. code at the chain mark position
initialize
{ ca | pattern herewego 15 } ! your protein is unlikely to contain this sequence.
Function: Selection, Information
Syntax:
1) RESIDUE
2) RESIDUE res_type1 [res_type2 ... ]
3) RESIDUE EXCEPT res_type1 [res_type2 ... ]
Note:
1) The first form shows a list of residue types. The number
enclosed in the brackets is the number of selected residues
of each residue type.
2) RESIDUE and ATOM are the only commands where the EXCEPT
keyword can be used.
Examples:
1) List out all the residue types, and count the number of each
type of residue.
zone all
residue
2) Select alanine residues.
residue ala
3) Select all amino acid residues (i.e. the residues that have Ca atoms) that have side chain atoms beyond the CB atom.
{ ca \ more | residue except Ala Gly }
or
initial
{ ca \ more }
exclude residue ala gly
Function: Selection
Syntax:
SIDE
Note:
1) The main chain atoms are defined with DFMAIN command.
2) Non main chain atoms often include solvent molecules.
See also: CA, DFMAIN, MAIN and ZONE
Examples:
1) Select side chain atoms in zone 1 - 50
{ side | zone 1 - 50 }
or
side 1 - 50 ; this syntax works for the CA, MAIN and SIDE
            ; as well as the ZONE commands
2) Select side chain atoms, including Ca atoms, of all Trp residues
dfmain n c o
{ side | residue trp }
Function: Selection
Syntax:
1) SWAP
2) SWAP group_id
Note:
1) In the first form, all the ON records will be switched to OFF,
and vice versa. By definition, the two sets of records have no
overlap.
2) In the second form, all the records in the specified group (if
there are any) will be turned ON, and the specified group will
be redefined as the previous ON records (if there are any). A
previous ON record will be turned OFF if it was not included
in the specified group previously. An overlap between the set
of the ON records and the set of the grouped records is
allowed.
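Both forms can be modeled as simple set operations. The Python sketch below is an illustration, assuming the ON records and each group are held as sets of record indices; the function names swap and swap_group are hypothetical.

```python
def swap(on, n_records):
    """Form 1: invert the ON/OFF status of every record."""
    return set(range(n_records)) - set(on)

def swap_group(on, groups, group_id):
    """Form 2: exchange the ON set with the named group; returns the new ON set."""
    new_on = set(groups.get(group_id, ()))
    groups[group_id] = set(on)  # the group now holds the previous ON records
    return new_on

inverted = swap({0, 2}, 4)

groups = {"tgt": {1, 2}}
on = swap_group({2, 3}, groups, "tgt")
```

In the second call, record 3 ends up OFF because it was not in the group, while record 2, which belonged to both sets, stays ON.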
See also: SORT
Examples:
1) Select all atoms
initial
swap
2) Select all non-protein atoms
initial
ca
more
swap
3) Calculate the rotation-translation matrix of overlaying molecule A to molecule B and that of overlaying molecule B to molecule A.
initial
{ ca | chain b \ group tgt }
{ ca | chain a }
overlay tgt a_to_b.dat ; calculate matrix a_to_b
swap tgt
overlay tgt b_to_a.dat ; calculate matrix b_to_a
Function: Selection
Syntax:
TEXT text_string [t1, t2]
Note:
t1 and t2 specify the column number in the displayed text
string between which the given string will be searched. The default
is to search the entire displayed text string.
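The column-restricted search can be sketched as follows. This Python fragment is an illustration, assuming t1 and t2 are 1-based inclusive column numbers and that the comparison is case-insensitive. The function name text_select and the sample records are hypothetical and do not follow the real displayed text layout.

```python
def text_select(texts, needle, t1=None, t2=None):
    """Indices of records whose text contains needle within columns t1..t2
    (1-based, inclusive); the whole string is searched when t1 is None."""
    hits = []
    for i, line in enumerate(texts):
        field = line if t1 is None else line[t1 - 1:t2]
        if needle.lower() in field.lower():
            hits.append(i)
    return hits

# Made-up records with a chain mark in column 15.
texts = [
    "cb  arg" + " " * 7 + "a 10",
    "cb  ala" + " " * 7 + "b 11",
    "ca  arg" + " " * 7 + "a 12",
]
chain_a = text_select(texts, " a ", 14, 16)
cb_and_arg = sorted(set(text_select(texts, "cb")) & set(text_select(texts, "arg")))
```

Restricting the columns keeps short strings such as a chain mark from matching elsewhere in the record.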
See also: PERMUTE and SETT UPDATE
Examples:
1) Select all the records that contain 12.345.
text 12.345
2) Select all the records that contain a in the chain mark column.
text ' a ' 14 16
3) Select all the records that contain both cb and arg.
{ text cb | text arg}
Function: Selection
Syntax:
W (<, >, <>) cutoff(s) [ANGLE]
Note: If the ANGLE option is selected, the values are considered as angles.
Examples:
1) Select atoms that have W smaller than 1.0
W < 1.0
2) Select solvent exposed atoms (e.g. those atoms with a solvent accessible area (SAA) larger than 5.0 Ų).
initial
zone 1 - 162 ; assume the protein molecule contains zone 1 - 162
access ; calculate SAA, overwrite the W field with SAA
exclude W < 5.0 ; exclude the atoms with less than 5.0 Ų SAA.
Function: Selection
Syntax:
X (<, >, <> and ><) cutoff(s) [ANGLE]
Note: If the ANGLE option is selected, the values are considered as angles.
Examples:
1) Select atoms of X larger than 0.0
x > 0.0
2) Select atoms of -10.0 < x < 10.0
initial
x <> -10.0 10.0
Function: Selection, Information
Syntax:
1) ZONE
2) ZONE res_id1 [res_id2 ... res_id16]
Note:
1) The first form shows the zone information. In the output, the number
enclosed in the brackets is the number of currently selected
records in the corresponding zone.
2) The res_idn can be a simple residue_id, a relative
residue_id, a complex residue_id, or a range.
This provides flexibility for writing macros.
a) A simple residue_id is a chain name (which can be a blank) immediately followed by the residue number. The simple res_id will be used as the register-zero for the next relative res_id if there is one.
b) A relative residue_id is a character `+' followed by an integer. It represents a residue separated from the register-zero by the given integer number of residues. The initial register-zero is the position before the first residue.
c) A complex residue_id is a chain name immediately followed by a relative residue_id. An underscore can be used for a chain which has a blank chain name. The corresponding residue_id is the one represented by the same text string without the `+', and the register is adjusted so that the selected residue is consistent with the relative residue_id in the text string.
d) A range contains two residue IDs separated with a hyphen `-' or ` TO '.
3) The keyword first stands for the first residue, and the keyword last stands for the last residue. The keyword ALL is a shortcut for first - last.
See also: GROUP and {subcommand}
Examples:
1) Select all atoms
zone all ; In this syntax, the keyword all means
         ; every record
2) List the zone information
zone all
zone ; list the number of selected atoms in each zone
3) Select residues 1 and 3 in chain A, and the range from residues 5 to 10 in chain B.
zone A1 A3 B5 - B10
4) Select residues from A10 to A21
zone A10 - +11 ; where A10 is a simple residue_id
               ; and +11 is a relative residue_id with A10 as
               ; the register-zero.
or
zone a+10 - +21 ; where a+10 is a complex residue_id for A10
                ; and +21 is a relative residue_id with A10 as
                ; the register 10.