Selection/Deselection

Available commands: ALIGN3D , ATOM , B , CA , CHAIN , EXCLUDE , EXTRACT , GROUP , INCLUDE , INITIAL , LOAD , MAIN , MATCH , MMI , MMIR , MORE , NAYB , NAYBR , PATTERN , RESIDUE , SIDE , SWAP , TEXT , W , X,Y,Z and ZONE .

Selection/deselection commands define the ON/OFF status of the atom records, thereby preparing objects for other EDPDB commands or other programs. For example, EDPDB can be used to select atoms for graphics utilities that read a PDB file as input but are less powerful at selecting atoms. EDPDB selects records by matching one of the PDB fields (except the entry number) against a selection criterion. Three-dimensional intra- or intermolecular distances, including distances between symmetry-related molecules, can also be used as selection criteria.

Selection strategies based on different logic, e.g. logic and, logic or and logic not, can be constructed from selection commands. In particular:

logic or -- parallel selection statements, i.e. a series of selection commands issued consecutively.

logic and -- see piping { ... | ... }.

logic not -- the EXCLUDE and SWAP commands, and the EXCEPT option of the ATOM and RESIDUE commands.
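
The three strategies can be pictured with plain Python sets of record indices. This is only an illustrative model, not EDPDB code: the record list, the helper names select_atom and select_residue, and the reading of a pipe as "the second selection only sees the output of the first" are assumptions made for the sketch.

```python
# Illustrative model only: records are (atom_name, residue_name) tuples,
# and a selection is the set of indices of the records that are ON.
records = [("N", "GLY"), ("CA", "GLY"), ("CA", "ASP"), ("CB", "ASP"), ("O", "HOH")]
everything = set(range(len(records)))


def select_atom(name, pool):
    """Indices in pool whose atom name equals name (hypothetical helper)."""
    return {i for i in pool if records[i][0] == name}


def select_residue(name, pool):
    """Indices in pool whose residue name equals name (hypothetical helper)."""
    return {i for i in pool if records[i][1] == name}


# logic or: consecutive selections accumulate in the ON set
logic_or = select_atom("CA", everything) | select_residue("ASP", everything)

# logic and: the second selection is applied to what the first one passed on
logic_and = select_residue("ASP", select_atom("CA", everything))

# logic not: every record except the named selection
logic_not = everything - select_atom("CA", everything)
```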

ALIGN3D

Select atom pairs from two overlapped structures (groups) based on 3D distances so that the two structures can be aligned better in 3D.

Function: Selection

Syntax:
load grp_id2 | ALIGN3D grp_id1 sub_grp_id1 [distance_cutoff [penalty_to_break]]

Note:
1) This routine creates two sets of atoms that are close to each other in 3D from two groups of atoms. One group is specified by grp_id1, the other (grp_id2) is selected previously and piped to this command. The matched atoms from grp_id1 are stored in a group sub_grp_id1; and those from the piped-in group become ON atoms.

2) The matching is based on a balance between maximizing the number of matched atom pairs and minimizing the overall coordinate difference. Needleman's alignment algorithm is used. The score for a given atom pair (i,j) is the following.

	s(i,j) = max(0.0, distance_cutoff**2 - distance(i,j)**2).

The default value of distance_cutoff is 2.0. The default value of penalty_to_break is -distance_cutoff**2.
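
The scoring rule and the defaults above can be tried out with a small Python sketch. It is not the EDPDB implementation: the function names are hypothetical, and two points the text leaves open are filled in by assumption, namely that penalty_to_break is added once for every atom left unpaired, and that only pairs with a positive score are reported as matched.

```python
def pair_score(p, q, distance_cutoff=2.0):
    """s(i,j) = max(0.0, distance_cutoff**2 - distance(i,j)**2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return max(0.0, distance_cutoff ** 2 - d2)


def align3d_sketch(group1, group2, distance_cutoff=2.0, penalty_to_break=None):
    """Global alignment of two coordinate lists; returns matched index pairs."""
    if penalty_to_break is None:
        penalty_to_break = -distance_cutoff ** 2  # documented default
    n, m = len(group1), len(group2)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = best[i - 1][0] + penalty_to_break
        move[i][0] = "up"
    for j in range(1, m + 1):
        best[0][j] = best[0][j - 1] + penalty_to_break
        move[0][j] = "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = pair_score(group1[i - 1], group2[j - 1], distance_cutoff)
            candidates = [
                (best[i - 1][j - 1] + s, "diag"),            # pair atom i with atom j
                (best[i - 1][j] + penalty_to_break, "up"),   # skip an atom of group1
                (best[i][j - 1] + penalty_to_break, "left"), # skip an atom of group2
            ]
            best[i][j], move[i][j] = max(candidates, key=lambda c: c[0])
    # Trace back from the corner and collect the pairs that actually score.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            if pair_score(group1[i - 1], group2[j - 1], distance_cutoff) > 0.0:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "up":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


group1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
group2 = [(0.1, 0.0, 0.0), (7.5, 0.0, 0.0)]
matched = align3d_sketch(group1, group2)
```

In this toy input the middle atom of group1 has no partner within 2.0 of it, so the sketch skips it and pairs the first and last atoms of group1 with the two atoms of group2.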

ATOM

Select specific types of atoms.

Function: Selection, Information

Syntax:
1) ATOM
2) ATOM atom_1 [atom_2 ... ]
3) ATOM EXCEPT atom_1 [atom_2 ... ]

Note:
1) The first form lists the atom names in the current input PDB file. The second form selects up to sixteen types of atoms. The third form selects every atom except the ones specified.
2) The specified atom names, atom_n, should be one of the input atom names.
3) ATOM is the only selection command where a wildcard (* or %) is accepted. '*' can only be used at the end of an atom_n and stands for any character string. '%' is used to replace one character in an atom_n. RESIDUE and ATOM are the only two commands where one can use the EXCEPT option.
4) Because '*' and '%' act as wildcards and (') is the default delimiter in the command interpreter, atom names that contain these characters may not be handled properly unless the wildcard or the delimiter is reset to some other character with the SETENV command.
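
The wildcard rules in note 3 can be illustrated with a short Python sketch. The function name atom_name_matches is hypothetical, and treating a trailing '*' as also matching an empty remainder is an assumption; EDPDB itself may differ in such details.

```python
def atom_name_matches(pattern, name):
    """Return True if name fits pattern under the rules of note 3.

    '*' is honoured only as the last character of the pattern and stands
    for any remaining characters; '%' stands for exactly one character.
    """
    if pattern.endswith("*"):
        prefix, open_ended = pattern[:-1], True
    else:
        prefix, open_ended = pattern, False
    if len(name) < len(prefix):
        return False
    if not open_ended and len(name) != len(prefix):
        return False
    return all(p == "%" or p == c for p, c in zip(prefix, name))


names = ["N", "CA", "CB", "CG", "OD1", "OD2", "ND1"]
carbon_like = [n for n in names if atom_name_matches("C*", n)]
delta_one = [n for n in names if atom_name_matches("%D1", n)]
```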

B

Select atoms whose B value is greater than, less than, between or beyond the given value(s), or select the first record that has the maximum B factor.

Function: Selection

Syntax:
1) B (<, >, <>) cutoff(s) [ANGLE]
2) B MAX

Note: If the ANGLE option is selected, the values are considered as angles.

CA

Select CA atoms.

Function: Selection

Syntax:
CA

CHAIN

Select the atoms which have residue ID starting with the given chain name.

Function: Selection, Information

Syntax:
1) CHAIN
2) CHAIN chn_mark_1 [chn_mark_2 ...]

Note:
1) The first form lists the available chains. The numbers listed in the parentheses are the number of records selected from the corresponding chain.
2) A chain name is read in from the input PDB file. It cannot be redefined, although the corresponding text string can be edited using the SETC command.
3) A chain name in the residue ID cannot be changed with the SETC command. Therefore the chain name in the residue ID may differ from that in the displayed text string.

EXCLUDE

Records selected following the EXCLUDE command will be set to OFF status until an INCLUDE, INITIAL or RESET command is issued.

Function: Selection, Definition

Syntax:
1) EXCLUDE
2) EXCLUDE another_selection_command

Note:
1) The first form sets the selection switch to OFF.
2) In the second form, the OFF status affects only the selection command which follows. The switch will then be set back to whatever it was before the current command.

See also: INCLUDE, INITIAL, RESET and SHDF

Examples:
1) Turn off the CB atoms.

      exclude atom CB

2) Analyze the x, y, z, w, b of main chain, side chain atoms and the whole protein.

      initial
      main
      analyze     ; analyze the main chain
      more
      analyze     ; analyze the whole protein
      exclude main
      analyze     ; analyze the side chain

3) Select the side chain atoms of all non-charged residues.

      zone all
      exclude     ; set the switch to OFF
      residue  asp glu   ; turn Asp Glu off
      residue  arg lys   ; turn Arg Lys off
      main        ; turn the backbone atoms off
      include     ; set the switch back to ON

EXTRACT

Select atoms (from a given GROUP) based on the matching between atoms of two other groups.

Function: Selection

Syntax:
load grp_id2 | EXTRACT grp_id1 sub_grp_id1

Note: grp_id1 and grp_id2 are the IDs of two groups, which should have the same number of atoms. sub_grp_id1 contains a subset of the atoms of grp_id1. The selected atoms will be the subset of grp_id2 that corresponds to sub_grp_id1 of grp_id1.
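
The correspondence described in the note can be sketched in Python as a position-by-position lookup. The function name extract_sketch and the use of plain strings as record labels are assumptions for illustration; the sketch also assumes that the n-th atom of grp_id1 corresponds to the n-th atom of grp_id2.

```python
def extract_sketch(grp1, grp2, sub_grp1):
    """Return the atoms of grp2 that sit at the positions of sub_grp1 in grp1."""
    if len(grp1) != len(grp2):
        raise ValueError("grp1 and grp2 must have the same number of atoms")
    wanted = set(sub_grp1)
    return [b for a, b in zip(grp1, grp2) if a in wanted]


grp1 = ["A1:N", "A1:CA", "A2:N", "A2:CA"]
grp2 = ["B1:N", "B1:CA", "B2:N", "B2:CA"]
selected = extract_sketch(grp1, grp2, ["A1:CA", "A2:CA"])
```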

GROUP

Define a set of atoms as one group, which can be used in calculations such as DISTANCE, MMIG, OVERLAY, DIFF as well as other selection commands.

Function: Definition, Information

Syntax:
1) GROUP
2) GROUP group_id

Note:
1) The first form will list the current non-empty group names.
2) The group_id is a character string of up to 4 characters. The group specified with the group_id will be created/overwritten to store the selected records.
3) The selection made within the subcommand will not affect the current ON atoms (the main buffer). The default selection is the ON atoms.
4) The group_id SCR is reserved for the program.
5) If the number of groups exceeds the program limit an error message will appear: some groups must be deleted before a new group can be defined.

See also: LOAD, SWAP and maximum number of groups.

Examples:
1) List all the currently defined group names

      group

2) Define the current ON atoms as group A

      group a

3) Define CA atoms as group g_ca without changing the current ON atom list.

      {ca | group g_ca }

4) Delete the group g_ca by defining it as an empty group.

      initial         
      group g_ca
  or
      { group g_ca }

INCLUDE

Records selected following the INCLUDE command will be set to ON status until an EXCLUDE command is issued.

Function: Selection, Definition

Syntax:
1) INCLUDE
2) INCLUDE another_selection_command

Note:
1) The first form sets the selection switch to ON.
2) In the second form, the ON status affects only the selection command which follows. The switch will then be set back to whatever it was before the current command.

See also: EXCLUDE, INITIAL, RESET and SHDF

Examples:
1) Set the selection switch to ON

      include

2) Turn the backbone atoms to ON

      include main

INITIAL

Set status of all atoms to OFF (ie. to empty the main buffer). Selection status is automatically switched to INCLUDE, ie. all atoms that are selected after INITIAL will be put into the main buffer.

Function: Selection, Definition

Syntax:
INITIAL

Note: INITIAL will change only the ON/OFF status (ie. pointer, or flag) of the records, but not any modification made to the records (eg. a coordinate transformation).

LOAD

Select a group of records which has been defined, for example, using the command GROUP.

Function: Selection

Syntax:
LOAD group_id

Note:
A group named SCR can be defined by some calculation commands such as DISTANCE.

MAIN

Select the main-chain atoms. Main-chain atoms are defined with the command DFMAIN. The program default is N, CA, C and O atoms.

Function: Selection

Syntax:
MAIN

MATCH

Select atoms based on a matching with residues AND atom-names of a given group.

Function: Selection

Syntax:
load grp_id2 | MATCH grp_id1 sub_grp_id1

Note:
1) This routine creates two sets of common atoms shared between two lists of records (common in terms of their atom names and residue names). One list is specified by grp_id1, the other is specified by grp_id2. The matched atoms from grp_id1 are stored in sub_grp_id1, and those from grp_id2 become ON atoms.
2) The matching is based on the order of the atoms in each list. The atoms of each list are compared on a residue-by-residue basis, and those that match are stored. Then the atoms of the next residue in each list are compared, until one of the lists is exhausted. The residue numbers are NOT compared; which residues are compared is determined solely by their order in each list.
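
A possible reading of these two notes, written as a Python sketch. Records are modelled as (residue_number, residue_name, atom_name) tuples, the residue number is used only to find where one residue ends and the next begins, and residue pairs whose residue names differ contribute no atoms. These choices, and the function names, are assumptions rather than documented EDPDB behaviour.

```python
from itertools import groupby


def split_residues(records):
    """Cut a record list into consecutive residues (runs of equal residue number)."""
    return [list(atoms) for _, atoms in groupby(records, key=lambda r: r[0])]


def match_sketch(grp1, grp2):
    """Return (matched atoms of grp1, matched atoms of grp2)."""
    sub1, on2 = [], []
    # zip stops as soon as either list runs out of residues
    for res1, res2 in zip(split_residues(grp1), split_residues(grp2)):
        if res1[0][1] != res2[0][1]:
            continue  # different residue names: nothing in common
        names1 = {atom[2] for atom in res1}
        names2 = {atom[2] for atom in res2}
        sub1.extend(atom for atom in res1 if atom[2] in names2)
        on2.extend(atom for atom in res2 if atom[2] in names1)
    return sub1, on2


grp1 = [(1, "GLY", "N"), (1, "GLY", "CA"), (2, "ASP", "CA"), (2, "ASP", "CB")]
grp2 = [(10, "GLY", "CA"), (10, "GLY", "O"), (11, "ASP", "CB"), (12, "LYS", "CA")]
sub_grp1, on_atoms = match_sketch(grp1, grp2)
```

Residue 1 is compared with residue 10 and residue 2 with residue 11 even though the numbers differ, which is the ordering rule of note 2.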

MMI

Select the neighbor atoms of a given atom from symmetry related molecules. MMI (sounds strange!) stands for Molecular-Molecular-Interaction.

Function: Calculation, Selection

Syntax:
1) MMI radius [res_id [atom_name]]
2) MMI radius CENTER x, y, z

Note:
1) In the first format, the atom specified with res_id and atom_name serves as the search center. The default atom of the specified residue is the first atom in that residue. If there is no res_id specified, the first ON atom will serve as the search center.
2) The second form uses user specified x, y, z as the search center.
3) The W column of the selected records will be changed to the distance between the specified center atom and its neighboring atoms.
4) The symmetry operator listed in the output should be applied to the specified atom to achieve the contact. The unitary (identity) symmetry operator is not included in this calculation.

See also: MMIG, MMIR, MOVECENTER, NAYB and NAYBR

Examples:
1) Select atoms that are in 4.0 Å crystal contact with Nd2 atom of residue 116.

      cell 61.2 61.2 96.8 90.0 90.0 120.0 6
           ; input the cell parameters
      @symmetry P3221
           ; input symmetry operators
      mmi 4.0 116 Nd2

MMIR

Select the neighbor atoms of a given residue from symmetry related molecules.

Function: Selection

Syntax:
MMIR radius res_id

Note:
1) Atoms in the symmetry related molecule that are within the radius of any atom in the specified residue will be selected.
2) Cell parameters and symmetry operator(s) are required for this command.

See also: MMI, MMIG, MOVECENTER and NAYBR

Examples:
1) Select atoms that are in 4.0 Å crystal contact with residue 116.

      cell 61.2 61.2 96.8 90 90 120 6
           ; input the cell parameters
      @symmetry p3221
           ; input symmetry operators
      mmir 4. 116

MORE

Expand the current selection to the entire residue(s) or the entire molecule (i.e. entire chain).

Function: Selection

Syntax:
1) MORE [i0 [i1]]
2) MORE CHAIN

Note:
1) If any atom in the i-th residue is currently ON, the MORE command will turn ON every atom in the zone from the (i+i0)th to the (i+i1)th residue. The default i0 is 0, and the default i1 is i0.
2) MORE works in a positive way only. For example, if an atom in a residue is currently ON, MORE command will turn every atom in that residue ON, regardless of the status of the INCLUDE/EXCLUDE switch.
3) The CHAIN option will expand ON status from a single atom to the entire molecule.
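
The first form can be modelled in Python as follows. The data layout (one sequential residue position per record) and the name more_sketch are assumptions for illustration; the sketch follows notes 1 and 2 in that it only ever turns records ON.

```python
def more_sketch(residue_index, on, i0=0, i1=None):
    """Expand an ON set to whole residues in the window i+i0 .. i+i1.

    residue_index[k] is the sequential position of the residue that holds
    record k; on is the set of record indices that are currently ON.
    """
    if i1 is None:
        i1 = i0  # documented default
    seeds = {residue_index[k] for k in on}
    wanted = {i + d for i in seeds for d in range(i0, i1 + 1)}
    expanded = {k for k, r in enumerate(residue_index) if r in wanted}
    return set(on) | expanded  # MORE never turns a record OFF


# Seven records spread over three residues (positions 0, 1 and 2).
residue_index = [0, 0, 1, 1, 1, 2, 2]
whole_residue = more_sketch(residue_index, {3})        # like "more"
tripeptide = more_sketch(residue_index, {3}, -1, 1)    # like "more -1 1"
```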

Examples:
1) Select the protein molecule only, ie. all residues that contain Ca atoms.

      initial
      ca
      more
  or
      { ca | more }

2) Select all residues that contain atoms within 4.0 Å from residue 99.

      initial
      naybr 4.0 99    ; select atoms 
      more            ; expand to residues
  or
      { naybr 4.0 99 | more }

3) Select all tripeptides that have a Gly as the middle residue.

      initial
      residue Gly
      more -1 1

NAYB

Select neighboring atoms of a specified atom. (NAYB is an odd word borrowed from the program FRODO).

Function: Calculation, Selection

Syntax:
1) NAYB radius [res_id [atom_name]]
2) NAYB radius CENTER x, y, z
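
As an illustration of the second form, the following Python sketch selects every atom within a radius of a given point. The function name nayb_sketch is hypothetical, and counting an atom that lies exactly at the radius as selected is an assumption.

```python
import math


def nayb_sketch(coords, radius, center):
    """Return (record index, distance) for every atom within radius of center."""
    hits = []
    for k, xyz in enumerate(coords):
        d = math.dist(xyz, center)  # Euclidean distance, Python 3.8+
        if d <= radius:
            hits.append((k, d))
    return hits


coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
neighbors = nayb_sketch(coords, 4.0, (0.0, 0.0, 0.0))
```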

NAYBR

Select neighboring atoms of a specified residue.

Function: Calculation, Selection

Syntax:
NAYBR radius res_id

Note:
No crystallographic symmetry information is required or used.

PATTERN

Select groups of records that fit with a given pattern in their text string.

Function: Selection

Syntax:
load grp_id | PATTERN pattern_string [position]

Note:
1) The pattern_string is a text string to be searched for. Each letter of the pattern_string is searched for in one record. For example, if the single-letter amino acid code is embedded in the text string of the records of Ca atoms, one can use this command to find the position of the amino acid sequence 'where' in the protein, and to select the five records of the corresponding Ca atoms. '%' is used as a one-letter wildcard, which can be changed with the SETENV command.
2) The position is an integer. It tells where in each record to look for the letter that forms part of the pattern. The default position is the chain mark position (15).
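
The search can be pictured with the Python sketch below: one character is taken from the same column of every record, and the pattern is then looked for in the resulting string. The function name, the short toy records and the 1-based reading of position are assumptions for illustration.

```python
def find_pattern(text_strings, pattern, position, wildcard="%"):
    """Return the runs of record indices whose column letters spell pattern.

    position is read as a 1-based column number (an assumption); records
    shorter than position contribute a blank.
    """
    column = "".join(
        t[position - 1] if len(t) >= position else " " for t in text_strings
    )
    hits = []
    for start in range(len(column) - len(pattern) + 1):
        window = column[start:start + len(pattern)]
        if all(p == wildcard or p == c for p, c in zip(pattern, window)):
            hits.append(list(range(start, start + len(pattern))))
    return hits


# Toy records with the one-letter residue code stored in column 2.
records = ["xW", "xH", "xE", "xR", "xE", "xA"]
exact = find_pattern(records, "WHERE", 2)
with_wildcard = find_pattern(records, "H%R", 2)
```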

RESIDUE

Select specific type(s) of residues.

Function: Selection, Information

Syntax:
1) RESIDUE
2) RESIDUE res_type1 [res_type2 ... ]
3) RESIDUE EXCEPT res_type1 [res_type2 ... ]

Note:
1) The first form shows a list of residue types. The number enclosed in the brackets is the number of selected residues of each residue type.
2) RESIDUE and ATOM are the only commands where the EXCEPT keyword can be used.

SIDE

Select the non-main-chain atoms.

Function: Selection

Syntax:
SIDE

Note:
1) The main chain atoms are defined with DFMAIN command.
2) Non main chain atoms often include solvent molecules.

SWAP

Set ON atoms to OFF and vice versa or switch the ON atoms with those in a given group.

Function: Selection

Syntax:
1) SWAP
2) SWAP group_id

Note:
1) In the first form, all the ON records will be switched to OFF, and vice versa. By definition, the two sets of records have no overlap.
2) In the second form, all the records in the specified group (if there are any) will be turned ON, and the specified group will be redefined as the previous ON records (if there are any). A previous ON record will be turned OFF if it was not previously included in the specified group. An overlap between the set of the ON records and the set of the grouped records is allowed.
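
Both forms reduce to simple set operations, as this Python sketch shows. The function names and the use of integer record indices are assumptions made for illustration.

```python
def swap_sketch(on, all_records):
    """First form: the new ON set is every record that was OFF."""
    return set(all_records) - set(on)


def swap_group_sketch(on, group):
    """Second form: return (new ON set, new group contents)."""
    return set(group), set(on)


all_records = {0, 1, 2, 3, 4}
on = {0, 1, 2}
group = {2, 3}

inverted = swap_sketch(on, all_records)
new_on, new_group = swap_group_sketch(on, group)
```

Record 2 belongs to both sets, so it stays ON after the second form, while records 0 and 1 are turned OFF and now live only in the group.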

TEXT

Select records which contain a given text string.

Function: Selection

Syntax:
TEXT text_string [t1, t2]

Note:
t1 and t2 specify the column number in the displayed text string between which the given string will be searched. The default is to search the entire displayed text string.

W

Select atoms which have W value (weight or occupancy) greater, less than, between or beyond the given value(s).

Function: Selection

Syntax:
W (<, >, <>) cutoff(s) [ANGLE]

Note: If the ANGLE option is selected, the values are considered as angles.

X (Y or Z)

Select atoms which have X (Y or Z) values (in the displayed text) greater, less than, between or beyond the given value(s).

Function: Selection

Syntax:
X (<, >, <> and ><) cutoff(s) [ANGLE]

Note: If the ANGLE option is selected, the values are considered as angles.

ZONE

Select residue zone(s) or show the zones.

Function: Selection, Information

Syntax:
1) ZONE
2) ZONE res_id1 [res_id2 ... res_id16]

Note:
1) The first form shows the zone information. In the output, the number enclosed in the brackets is the number of currently selected records in the corresponding zone.
2) The res_idn can be a simple residue_id, a relative residue_id, a complex residue_id, or a range. This provides flexibility for writing macros.

a) A simple residue_id is a chain name (which can be a blank) immediately followed by the residue number. The simple res_id will be used as the register-zero for the next relative res_id if there is one.
b) A relative residue_id is a character `+' followed by an integer. It represents a residue separated from the register-zero by the given integer number of residues. The initial register-zero is the position before the first residue.
c) A complex residue_id is a chain name immediately followed by a relative residue_id. An underscore can be used for a chain which has a blank chain name. The corresponding residue_id is the one represented by the same text string but without the `+', and the register is adjusted so that the selected residue is consistent with the relative residue_id in the text string.
d) A range contains two residue IDs separated by a hyphen `-' or ` TO '.

3) The keyword first stands for the first residue, and the keyword last stands for the last residue. The keyword ALL is a shortcut for first - last.