CONVERT: Inferring the Sequence
These statements define the assumptions about connectivity that are implicit in other file formats. Convert requires them in order to construct a TNT sequence file. A file containing the most common assumptions is supplied with the TNT distribution. This file is named $tntdata/connect.dat. It should be INCLUDEd prior to the inclusion of any foreign coordinate file, such as a PDB or DSN2 file, if a sequence file is to be PUNCHED. (This is done in the shell command from_pdb.)
The basic idea is that collections of residue types can be treated identically. Such a collection is defined on a CATEGORY statement. For example, in terms of connectivity all amino acids are considered the same. Therefore we define a category named AMINOACID that consists of the twenty amino acid types (some types have synonymous names) with a CATEGORY statement that gives the category name followed by the residue types.
We then specify what type of link should be generated when an AMINOACID follows another AMINOACID with a CONNECT statement. This statement looks like
CONNECT AMINOACID AMINOACID PEPTIDE
That is, when one amino acid follows another, the two are connected with a PEPTIDE linkage.
Finally, a DANGLING statement defines what to do when a residue of a particular category is followed by nothing. A dangling amino acid should be connected to an empty residue with a BREAK linkage. This requirement is translated to
DANGLING AMINOACID BREAK
CATEGORY <Category name> N(<Residue type>)
This statement is used to place a group of residue types into a category. All the residues in a category will be treated the same when the connectivity of a molecule is deduced.
CATEGORY BASE ADE GUA CYT THY
This example defines the residue types ADE, GUA, CYT, and THY to be in the category BASE.
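The effect of a CATEGORY statement can be pictured as a lookup table from residue type to category name. The Python sketch below is only an illustration, not Convert's own code: the function name parse_categories is hypothetical, and it assumes each statement sits on one line with whitespace-separated fields, which the actual file format may not require.

```python
def parse_categories(lines):
    """Build a residue-type -> category lookup from CATEGORY statements.

    Assumption: one statement per line, fields separated by whitespace.
    """
    residue_to_category = {}
    for line in lines:
        tokens = line.split()
        # Ignore blank lines and every statement other than CATEGORY.
        if len(tokens) < 2 or tokens[0] != "CATEGORY":
            continue
        category = tokens[1]
        # Every remaining field is a residue type belonging to the category.
        for residue_type in tokens[2:]:
            residue_to_category[residue_type] = category
    return residue_to_category


lookup = parse_categories(["CATEGORY BASE ADE GUA CYT THY"])
print(lookup["GUA"])
```

With the BASE example, each of ADE, GUA, CYT, and THY maps to BASE, so later rules only need to mention the category name.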
CONNECT <Category name> <Category name> <Linkage type>
The CONNECT statement says that whenever a residue of the first category is followed by one of the second, they are to be linked with the given linkage type. This mechanism will not add any secondary linkages, like disulfide bonds, but can build a basic peptide backbone.
CONNECT BASE BASE dSUGARPHOS
This example says that when a BASE follows a BASE they should be joined by a dSUGARPHOS linkage. Of course, you would want to say something different if your molecule contained RNA instead of DNA.
DANGLING <Category name> <Linkage type>
The DANGLING statement is used when the residues of a particular category require a link even when they are not followed by anything. Both amino acids and nucleic acids have this requirement. When the sequence file is being created and a dangling residue is discovered, a dummy residue is manufactured and the dangling residue is linked to it via the linkage type given on the DANGLING statement.
DANGLING BASE d3'END
This example says that the terminal BASE should be connected to a dummy residue with a d3'END linkage.
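Taken together, the three statement types describe a simple pass over the residues in file order. The Python sketch below shows one way that pass could work for the DNA example; it is not Convert's implementation. The tables, the infer_links function, and the "<dummy>" placeholder name are hypothetical, and leaving a pair unlinked when no CONNECT rule matches is an assumption, since that case is not described here.

```python
# Hypothetical in-memory form of the three statement types (DNA example).
CATEGORY = {"ADE": "BASE", "GUA": "BASE", "CYT": "BASE", "THY": "BASE"}
CONNECT = {("BASE", "BASE"): "dSUGARPHOS"}  # keyed by (first, second) category
DANGLING = {"BASE": "d3'END"}
DUMMY = "<dummy>"  # placeholder name for the manufactured residue


def infer_links(residues):
    """Return (residue, next_residue, linkage) triples for a residue list."""
    links = []
    for i, residue in enumerate(residues):
        category = CATEGORY.get(residue)
        if i + 1 < len(residues):
            following = residues[i + 1]
            # The rule is ordered: first category followed by second.
            linkage = CONNECT.get((category, CATEGORY.get(following)))
            if linkage is not None:
                links.append((residue, following, linkage))
            # Assumption: a pair with no CONNECT rule gets no link.
        elif category in DANGLING:
            # The last residue is followed by nothing: link it to a dummy.
            links.append((residue, DUMMY, DANGLING[category]))
    return links


for link in infer_links(["ADE", "CYT", "GUA"]):
    print(link)
```

For the three-residue chain ADE, CYT, GUA, this yields two dSUGARPHOS links followed by a d3'END link from GUA to the dummy residue.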