Families and clusters (CLUFAM)

Introduction.

A family is a group of one or more amino acids located consecutively in the sequence. Families carry no special meaning; they are simply a way of giving names to stretches of amino acids. For example, you can give every major secondary structure element its own name. Families can be used as input for options at several stages. It is possible, for example, to give a family a color, or to delete all residues in a family.

Commands that are related to usage of families are easily recognized because they have the three letter combination FAM in their name.

A cluster is a group of residues that do not need to sit next to each other in the sequence. In a way, clusters are sets of families.

Commands that are related to usage of clusters are easily recognized because they have the three letter combination CLU in their name.

There are no restrictions on the residues in families or clusters: they do not need to contact each other, and they can be a mixture of protein, nucleic acid, drugs, ions, water, etc. The only problem is that some options cannot accept every cluster as input. For example, a protein mutation option cannot accept as its input range a family that contains drugs.
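The difference between the two groupings can be illustrated with a minimal Python sketch. This is not how WHAT IF stores families or clusters internally; the class names, the use of plain integers as residue numbers, and the example names HELIX1, STRAND2 and CORE are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Family:
    """Hypothetical model: a named, consecutive stretch of residues."""
    name: str
    first: int  # first residue number (inclusive)
    last: int   # last residue number (inclusive)

    def residues(self) -> set[int]:
        # A family always covers every residue between first and last.
        return set(range(self.first, self.last + 1))


@dataclass
class Cluster:
    """Hypothetical model: a named group of residues that need not be adjacent."""
    name: str
    members: set[int] = field(default_factory=set)

    def add_family(self, family: Family) -> None:
        # Adding a family merges its residues into the cluster.
        self.members |= family.residues()


helix1 = Family("HELIX1", 10, 14)
strand2 = Family("STRAND2", 40, 42)

core = Cluster("CORE")
core.add_family(helix1)
core.add_family(strand2)
print(sorted(core.members))  # [10, 11, 12, 13, 14, 40, 41, 42]
```

In this model a family is fully described by its two end points, whereas a cluster has to list its members explicitly because they may be scattered over the sequence.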

You can enter this menu with the commands CLUFAM, CLUSTER or FAMILY.

The general (menu-independent) option CHOOSE pre-selects a cluster. After CHOOSE has been used, every WHAT IF option that accepts clusters when residue input is required will use the cluster you created with the CHOOSE command instead of prompting you for residues. The NEWRNG command disables the CHOOSE option.

Families

Initialize families (INIFAM)

WHAT IF's entire memory about families can be initialized with the INIFAM command. This command does nothing to the amino acids in those families; only the information about the groupings is erased.

List existing families (SHOFAM)

The command SHOFAM can be used to get a listing of all presently available families.

Create a family (SETFAM)

The command SETFAM will cause WHAT IF to prompt you for a residue range and a family name. The range will be stored as a family under that name.

Delete a family (DELFAM)

The command DELFAM will cause WHAT IF to prompt you for a family name. This family will be removed from the list of families.

Clusters

Initialization of clusters (INICLU)

The command INICLU will cause WHAT IF to erase all information about the grouping of amino acids in clusters from its memory. Nothing is done with the amino acids themselves.

Listing the contents of a cluster (SHOCLU)

The command SHOCLU can be used to inspect the contents of a cluster. You will be prompted for the number of the cluster.

Manually defining clusters (SETCLU)

The command SETCLU will cause WHAT IF to prompt you for ranges of residues (finish input of ranges with a zero as usual), and for a cluster name.

Creating a cluster (HYDCLU)

A hydrophobic cluster is a group of amino acids which is generated by the following set of rules:

1) No amino acid has an accessibility above a certain threshold value (default = 1.0 Angstrom**2).

2) Every amino acid in the cluster is close to at least two other amino acids in the cluster; being close is defined as having at least one pair of atoms within a certain distance from each other (default = 5.0 Angstrom).

The command HYDCLU will cause WHAT IF to prompt you for a residue and a residue range. It will try to generate the largest cluster (as defined above) in the range that contains the first given residue.
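The two rules above can be turned into a small Python sketch. This is only an illustration under stated assumptions, not the procedure WHAT IF itself uses: the function name, the dictionary inputs, and the extra requirement that the cluster is connected to the seed residue through contacts are all assumptions. Accessibilities and residue contacts are taken as precomputed input for the residues in the range.

```python
from collections import deque


def hydrophobic_cluster(seed, accessibility, contacts,
                        max_access=1.0, min_neighbours=2):
    """Largest set of residues containing `seed` that satisfies both rules.

    accessibility: dict residue -> accessible surface (Angstrom**2),
                   one entry per residue in the range (assumed input).
    contacts:      dict residue -> set of residues it is close to
                   (assumed to be precomputed from atom distances).
    Returns an empty set if no such cluster contains the seed.
    """
    # Rule 1: only residues at or below the accessibility threshold qualify.
    candidates = {r for r, acc in accessibility.items() if acc <= max_access}
    if seed not in candidates:
        return set()

    while True:
        # Assumption: keep only residues reachable from the seed via contacts.
        reachable = {seed}
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            for other in contacts.get(current, set()) & candidates:
                if other not in reachable:
                    reachable.add(other)
                    queue.append(other)

        # Rule 2: every member needs at least two close neighbours in the set.
        weak = {r for r in reachable
                if len(contacts.get(r, set()) & reachable) < min_neighbours}
        if not weak:
            return reachable
        if seed in weak:
            return set()
        # Removing weak residues may weaken others, so repeat until stable.
        candidates = reachable - weak


# Hypothetical data: residues 1-3 form a buried triangle, residue 4 touches
# only residue 3, and residue 5 is too exposed to qualify.
accessibility = {1: 0.0, 2: 0.5, 3: 1.0, 4: 0.2, 5: 20.0}
contacts = {1: {2, 3, 5}, 2: {1, 3, 5}, 3: {1, 2, 4}, 4: {3}, 5: {1, 2}}
print(sorted(hydrophobic_cluster(1, accessibility, contacts)))  # [1, 2, 3]
```

Pruning is repeated because removing one residue can leave a former neighbour with fewer than two contacts; the loop stops once every remaining residue satisfies rule 2.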

Using clusters and families as input to other options

If WHAT IF prompts you for residues, it always asks you to give one of the following:
One residue
One range of residues
Multiple ranges of residues

If you are prompted for one residue, that is exactly what you can give: one residue.

If you are prompted for one range, you can give ranges as usual, but you can also type the name of a family. The range of this family will become the input to the option.

If you are prompted for multiple ranges, you can, each time you are prompted for a range, give a range as usual, give the name of a family, or give the name of a cluster. If you give ranges, families, or clusters with overlapping residues, the option will use all residues that are part of at least one of those ranges, families, or clusters.
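The way overlapping input is combined amounts to a set union. The following Python sketch illustrates that idea only; the function name `expand_input`, the dictionary layout, and the example names are hypothetical and do not correspond to anything in WHAT IF itself.

```python
def expand_input(items, families, clusters):
    """Resolve a mix of ranges, family names and cluster names to residues.

    items:    list whose entries are either a (first, last) tuple
              or the name of a family or cluster.
    families: dict name -> (first, last) residue range (inclusive).
    clusters: dict name -> set of residue numbers.
    """
    selected = set()
    for item in items:
        if isinstance(item, tuple):
            first, last = item
            selected |= set(range(first, last + 1))
        elif item in families:
            first, last = families[item]
            selected |= set(range(first, last + 1))
        elif item in clusters:
            selected |= clusters[item]
        else:
            raise KeyError(f"unknown family or cluster: {item}")
    # A residue named more than once still appears only once in the result.
    return selected


families = {"HELIX1": (10, 14)}
clusters = {"CORE": {12, 13, 40}}
selection = expand_input([(8, 11), "HELIX1", "CORE"], families, clusters)
print(sorted(selection))  # [8, 9, 10, 11, 12, 13, 14, 40]
```

Residues 10 to 13 are named by more than one input here, yet each is passed on only once.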

Parameters for clusters and families

The following parameters influence the behaviour of at least one of the cluster or family options:

LIMBUR in the ACCESS parameter menu (LIMBUR)

The LIMBUR parameter in the parameters menu of the ACCESS menu determines the maximal allowed accessible surface in order for a residue to be called buried by the HYDCLU option.

VDWOVR in the ACCESS parameter menu (VDWOVR)

The VDWOVR parameter in the parameters menu of the ACCESS menu determines the maximal allowed distance between the Van der Waals surfaces of two atoms for them to be called a contact in the HYDCLU option. (This is the same parameter as VDWDST in the WATER parameter menu.)
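A distance between Van der Waals surfaces can be read as the distance between the atom centres minus both radii. The Python sketch below illustrates that reading; the function names, the radius of 1.7 Angstrom, and the `max_gap` value of 1.0 Angstrom are assumptions chosen for the example, not WHAT IF defaults.

```python
import math


def surface_gap(pos_a, radius_a, pos_b, radius_b):
    """Distance between the Van der Waals surfaces of two atoms (Angstrom).

    A negative value means the two spheres overlap.
    """
    return math.dist(pos_a, pos_b) - radius_a - radius_b


def in_contact(pos_a, radius_a, pos_b, radius_b, max_gap):
    """True if the surface-to-surface distance does not exceed max_gap."""
    return surface_gap(pos_a, radius_a, pos_b, radius_b) <= max_gap


# Two atoms with an assumed radius of 1.7 Angstrom, centres 4.0 Angstrom apart:
# the surfaces are about 0.6 Angstrom apart.
near = in_contact((0.0, 0.0, 0.0), 1.7, (4.0, 0.0, 0.0), 1.7, max_gap=1.0)

# The same atoms 6.0 Angstrom apart: the surfaces are about 2.6 Angstrom apart.
far = in_contact((0.0, 0.0, 0.0), 1.7, (6.0, 0.0, 0.0), 1.7, max_gap=1.0)

print(near, far)  # True False
```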

AA1OR3 in the general parameter menu (AA1OR3)

The parameter AA1OR3 in the general parameter menu (you enter that menu with the SETPAR command) determines if the SHOCLU and SHOFAM commands list the residues in 1 or 3 letter code.