Teaching module

Introduction.

This chapter holds the text used by the teaching module. It has one sub-chapter for each of the following aspects of proteins and their environment:
1) General protein structure analysis
2) Problem 1: HIV-1 protease inhibitors
3) More examples are planned, ...
To use the teaching module, proceed as follows:

Start WHAT IF.

Execute the general command TEACH once to copy all files from the `teach` directory (a subdirectory of the directory in which the WHAT IF executable resides) to your local directory. After that, type

FULLST Y
to get out of WHAT IF again. (Don't worry about not understanding what you just did; it only needs to be done once, and it has nothing to do with science.) Now you are ready to start running the teaching scripts.

Run the teaching scripts. Every script has a code name that is given in brackets at the end of the chapter heading (e.g. AA0001, BB0105, etc.). So,

whatif @AA0001
runs the first script on SG machines. Under PCDOS you first start WHAT IF and then use the script command, like:
script AA0001

At the end of every script, WHAT IF terminates. So, for every script you have to start WHAT IF again.

Division of scripts in classes

The scripts have a nomenclature that makes it easy to group them in families that logically belong together. It is envisaged that several hundred scripts will be available in due time, to give courses to students ranging in level from high-school pupils to graduate students in biology and biophysics. A teacher then only has to put a series of script numbers in a row, and check the answers to the questions.

Script names consist of two characters followed by two groups of two digits. These three groups generate three hierarchical layers in the scripts. The first two characters indicate:

AA   General
BB   Problem 1 HIV-1 protease inhibitors
The further division follows the lines of the table below:
AA   GENERAL

     00 Visualisation

        01 Shows some display modes using Hypothase 
           (all, C-alpha, H-bonds, Sec.Str., Acc.Surf.)
        02 Shows some colouring schemes for the display of 
           Hypothase residues
        03 Emphasize that colouring schemes can identify
           differences in protein structure
        04 Shows a Ramachandran plot of Hypothase
        05 Shows contacts in Crambin by dashed lines
        06 Shows contact plot of Crambin
        07 Shows surface plots of HLA
        08 Structural superposition (Hemoglobin A and B)
        09 Shows some colouring schemes for the display of
           Hypothase residues (backbone, side-chains)
        10 Shows surface plot of Hypothase
(The AA 00 ** group holds the scripts used at the EMBL for a one-day course for first-year PhD students)
     01 Primary structure
        
        01 Primary structure of Hypothase
        02 Classes of amino acids (example: Hypothase)

     02 Secondary structure

        01 Shows secondary structure (alpha helix) example
        02 Shows secondary structure (anti-parallel beta sheet)
           example
        03 Shows secondary structure (alpha helix) example
           including hydrogen bonding pattern
        04 Shows secondary structure (anti-parallel beta sheet) 
           example, including hydrogen bonding pattern
        05 Shows secondary structure (parallel beta sheet) example
        06 Shows secondary structure (parallel beta sheet) example
           including hydrogen bonding pattern
        07 Ramachandran plot of Hypothase

     03 Tertiary structure

        01 Shows combination of secondary structures (TIM)
           including hydrogen bonding pattern and ribbons
        02 Shows the structure of HIV-1 Protease sub-unit A
           including hydrogen bonding pattern and ribbons
        03 Shows the structure of Hypothase including hydrogen
           bonding pattern and ribbons
        04 Example of anti-parallel beta strands in the domains of 
           Aspartate Transcarbamylase
        05 Example of parallel beta strands in Flavodoxin 
        06 Example of (anti)parallel beta barrel in Plastocyanin
        07 Example of the helix-loop-helix motif in Calmodulin
        08 Example of the hairpin-beta motif in Erabutoxin
        09 Example of the beta-alpha-beta-alpha motif in TIM
        10 Example of the helix-bundle motif in Myohemerythrin

     04 Quaternary structure

        01 Shows the complete structure of Triose Phosphate Isomerase
           (TIM) including hydrogen bonding pattern and ribbons
        02 Shows the complete structure of HIV-1 Protease
           including hydrogen bonding pattern and ribbons
(The AA 00 01-04 group can be incorporated in any other set of scripts because they are very simple and very clear examples of elementary aspects of protein structure).
BB   PROBLEM 1 HIV-1 PROTEASE INHIBITORS

     00 Introduction
     
     01 Structure of HIV-1 Protease

        01 Primary structure    (sub-unit A)
        02 Tertiary structure   (sub-unit A) (Backbone & C-alpha & H-bonds)
        03 Tertiary structure   (sub-unit A) (H-bonds & Ribbons)
        04 Quaternary structure (sub-units A & B) (colours)
        05 Quaternary structure (sub-units A & B) (Ribbons)
        06 Shows tertiary structure of HIV-1 Protease with superposed
           sub-units A & B
        07 Contacts between HIV-1 Protease sub-units 
        08 Plot of the contacts between the HIV-1 Protease sub-units
 
     02 Homology of HIV-1 Protease with other structures

        00 Introduction
        01 Shows homology of HIV-1 Protease with Rous Sarcoma Virus Protease
        02 Shows homology of HIV-1 Protease with Penicillopepsin
        03 Shows homology of HIV-1 Protease with Endothiapepsin
        04 Shows homology of HIV-1 Protease with Rhizopuspepsin
        05 Shows homology of Endothiapepsin with Penicillopepsin
        06 Shows homology of Rhizopuspepsin with Endothiapepsin
        07 Shows homology of HIV-1 Protease with Pepsin
        
     03 Inhibition of HIV-1 Protease

        01 Active site of HIV-1 Protease
        02 Example of a HIV-1 Protease inhibitor (A74704) and its
           interaction with both sub-unit A & B 
        03 Contacts of inhibitor A-74704 with HIV-1 Protease active site
(The BB 00 00-03 group is a two-to-three-day course for third-year undergraduate students at the University of Leiden. It is probably better to go through one or two days of AA scripts before starting the BB scripts. The BB scripts require sequence databases, Mosaic or Netscape WWW browsing, etc., as support.)
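
The naming scheme described above can be sketched in a few lines of Python. This is only an illustration and not part of WHAT IF: the function names parse_script_name and command_line are hypothetical, and the only facts taken from this chapter are the two-letter family codes, the two pairs of digits, and the `whatif @AA0001` way of starting a script on SG machines.

```python
import re

# Family codes copied from the table above.
FAMILIES = {
    "AA": "General",
    "BB": "Problem 1: HIV-1 protease inhibitors",
}

# Two letters, then two groups of two digits (e.g. BB0105).
SCRIPT_NAME = re.compile(r"^([A-Z]{2})(\d{2})(\d{2})$")


def parse_script_name(name):
    """Split a script name such as 'BB0105' into (family, group, script)."""
    match = SCRIPT_NAME.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a valid script name: {name!r}")
    family, group, script = match.groups()
    return family, int(group), int(script)


def command_line(name):
    """Return the command that starts one script on SG machines."""
    parse_script_name(name)  # raises ValueError for malformed names
    return f"whatif @{name.strip().upper()}"


if __name__ == "__main__":
    # A teacher's list of scripts for one session (hypothetical selection).
    for name in ["AA0001", "AA0207", "BB0105"]:
        family, group, script = parse_script_name(name)
        print(name, "|", FAMILIES.get(family, "unknown family"),
              "| group", group, "| script", script, "|", command_line(name))
```

Because WHAT IF terminates at the end of every script, a course is simply a sequence of such command lines, one per script.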

GENERAL

Introduction.

This is the part in the teaching module that describes elementary visualisation and structure aspects of proteins. Scripts are available for the following teaching elements:

1) Visualisation
2) Primary structure
3) Secondary structure
4) Tertiary structure
5) Quaternary structure

Visualisation

Display modes using Hypothase (AA0001)

The molecule Hypothase will be read in and displayed on the screen. A number of graphical objects are created (presented at the bottom of the screen as a set of boxes labeled MOL1 to MOL6), which contain the following graphical representations of the protein:

MOL1 :  the whole molecule, coloured by atom type,
MOL2 :  the hydrogen bonds present,
MOL3 :  the molecular surface, represented by dots,
MOL4 :  an alpha carbon trace,
MOL5 :  a backbone only trace,
MOL6 :  a ribbon representation.
Questions:

1) Try to find the helix and the 3 strands.

2) Which hydrogen bond patterns govern these elements?

3) Why are these hydrogen bonds so much more irregular than expected?

Colouring schemes for Hypothase residues (AA0002)

The molecule Hypothase will be read in and displayed on the screen. The following colours are used:

RED      :  for acidic amino acids (glutamic and aspartic acid)
BLUE     :  for basic amino acids (arginine, lysine, histidine)
PURPLE   :  for polar residues (glutamine, asparagine)
YELLOW   :  for sulphur containing amino acids (cysteine)
GREEN    :  for hydrophobic residues (leucine, valine, tryptophan, etc.)
GREENISH :  for alcoholic residues (threonine, serine, tyrosine)
The following graphical objects are available:

MOL1 :  the whole molecule coloured by residue type
MOL2 :  as MOL1, but with the backbone reduced to a C-alpha trace
MOL3 :  as MOL1, but with the backbone reduced to a backbone trace
MOL4 :  a ribbon representation
Questions:

1) Are the polar and apolar residues distributed randomly over the structure?

2) If not, derive a rule for this distribution.

3) Does this molecule have a 'hydrophobic core'?

Colouring schemes identify aspects of proteins (AA0003)

The core of a protein normally is hydrophobic, but the surface holds all kinds of residues. With the aid of colours the function of protein structures can be clarified and explained. The membrane protein Ridase will be displayed:

MOL1 :  the whole molecule coloured by residue type
MOL2 :  as MOL1, but with the backbone reduced to a C-alpha trace
MOL3 :  as MOL1, but with the backbone reduced to a backbone trace
MOL4 :  a ribbon representation
Questions:

1) Now that we know how residues are distributed over the protein, see how much of that rule holds in this molecule.

2) Any idea where the differences come from?

3) Does this molecule have a 'hydrophobic core'?

Ramachandran plot of Hypothase (AA0004)

A protein can only fold if the torsion angles in the backbone can vary from residue to residue. Two backbone angles are important: Phi and Psi. A plot of Phi against Psi can be useful to get a quick impression of a protein molecule. Because of steric hindrance not every combination of Phi & Psi torsion angles is possible. The allowed regions are indicated. The Phi/Psi plot of the protein Hypothase will be displayed:

MOL1 :  a Phi-Psi plot of Hypothase (click on a data point to identify 
        the corresponding residue).
Questions:

1) Try to find out which parts of the molecule end up where in this plot.

2) Three residues fall wide outside the 'boxed' areas. Why?

Contacts in Crambin (AA0005)

Proteins only fold if the side chains that come together make favourable contacts. Drawing contact-lines is an often used way of looking at contact-patterns in proteins. The protein Crambin will be displayed:

MOL1 :  the whole molecule, coloured by atom type,
MOL2 :  dotted lines between contacting atoms,
MOL3 :  dots that indicate the Van der Waals surface.
Questions:

1) Which residues are very important for the folding of this molecule?

2) Which residues are 'totally useless' for Crambin?

3) Why are those 'useless' residues nevertheless present?

Contact plot of Crambin (AA0006)

Proteins only fold if the side chains that come together make favourable contacts. A contact plot is a sophisticated way of looking at contact patterns in proteins. The contact plot of the protein Crambin will be displayed:

MOL1 :  a contact plot (click on the lower left corner of a square
to identify the corresponding contacting residues).
Questions:

1) Why is the diagonal of this plot sometimes wide, and sometimes narrow?

2) Can you find the 'useful' residues detected in the previous exercise?
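
A contact plot is essentially a square matrix with one row and one column per residue, in which a cell is marked when the two residues are close in space. The Python sketch below builds such a matrix from one coordinate per residue. The 6.0 Angstrom cutoff, the use of a single point per residue and the function names contact_map and render are assumptions made for this illustration; WHAT IF may define contacts differently.

```python
import math


def contact_map(coords, cutoff=6.0, min_separation=1):
    """Return a square matrix of booleans, True where two residues are in contact.

    coords: list of (x, y, z) tuples, one point per residue.
    cutoff: maximum distance (same unit as coords) that counts as a contact.
    min_separation: smallest sequence distance |i - j| that is examined.
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # The plot is symmetric: i touches j means j touches i.
                contacts[i][j] = contacts[j][i] = True
    return contacts


def render(contacts):
    """Draw the matrix as text: '#' for a contact, '.' for no contact."""
    return "\n".join("".join("#" if c else "." for c in row) for row in contacts)


if __name__ == "__main__":
    # Invented toy coordinates, not taken from Crambin.
    toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
           (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
    print(render(contact_map(toy)))
```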

Surface plots of HLA (AA0007)

The structure of a protein often reveals a lot about its function. A good example is HLA. This molecule presents foreign antigens to the immune system (it presents oligopeptides to the helper T-cells). The surface of HLA can be represented with a mesh pattern:

MOL1 :  the whole molecule,
MOL2 :  a very low resolution surface plot,
MOL3 :  a ribbon representation.
Questions:

1) How many independent domains can you find in HLA?

2) Any idea where the oligopeptide probably sits in HLA?

Structural superposition of Hemoglobin A and B (AA0008)

The sequence alignment of the hemoglobin A and B chains will (almost independent of the alignment program used) look roughly like:

 1     - V L S P A D K T N V K A A W G K V G A H A G E Y G A E A L   30
       V H L T P E E K S A V T A L W G K V - - N V D E V G G E A L

 31    E R M F L S F P T T K T Y F P H F - D L S H - - - - - G S A   60
       G R L L V V Y P W T Q R F F E S F G D L S T P D A V M G N P

 61    Q V K G H G K K V A D A L T N A V A H V D D M P N A L S A L   90
       K V K A H G K K V L G A F S D G L A H L D N L K G T F A T L

 91    S D L H A H K L R V D P V N F K L L S H C L L V T L A A H L  120
       S E L H C D K L H V D P E N F R L L G N V L V C V L A H H F

121    P A E F T P A V H A S L D K F L A S V S T V L T S K Y R *    150
       G K E F T P P V Q A A Y Q K V V A G V A N A L A H K Y H *
Around residue 50 we see a strong identical triplet: DLS. Based on the structures, Asp 49 of the A-chain should be aligned with Gly 48 from the B-chain.

Questions:

1) What is wrong in the above alignment around residue 50?

2) Try to improve the alignment near the N-termini.
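
Once two sequences are aligned, the result is often summarised as the percentage of identical residue pairs. The Python sketch below counts identities over the columns that hold a residue in both rows; other conventions for the denominator exist, so treat the choice made here as an assumption. The function name percent_identity is hypothetical, and the two strings are the first block of the alignment above with the spaces removed.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percentage of gap-free alignment columns that hold the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # Keep only the columns in which both sequences have a residue.
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    identical = sum(1 for a, b in aligned if a == b)
    return 100.0 * identical / len(aligned)


if __name__ == "__main__":
    # Columns 1-30 of the hemoglobin alignment shown above.
    chain_a = "-VLSPADKTNVKAAWGKVGAHAGEYGAEAL"
    chain_b = "VHLTPEEKSAVTALWGKV--NVDEVGGEAL"
    print(f"{percent_identity(chain_a, chain_b):.1f}% identity in this block")
```

Shifting a gap by one or two positions, as asked in the questions, changes this number, which is one quick way to compare alternative alignments.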

Colouring schemes for the display of Hypothase residues (AA0009)

The molecule Hypothase will be read in and displayed on the screen. To emphasize the difference between the backbone atoms and the side-chain atoms, the two groups are coloured differently. The following colours are used:

PURPLE   :  for the backbone atoms.
GREEN    :  for the side-chain atoms.
The following graphical objects are available:

MOL1 :  the whole molecule coloured by atom type.
MOL2 :  the whole molecule, the backbone and side-chain atoms
        coloured differently.
MOL4 :  Lines indicating atomic contacts
MOL5 :  A ribbon representation
Questions:

1) Try to analyse backbone-backbone, backbone-sidechain and sidechain-sidechain contacts.

2) What are the major differences between these three classes?

Surface plot of Hypothase (AA0010)

The molecule Hypothase will be read in and displayed on the screen. A surface mesh will be calculated around the molecule and shown together with the molecule. The calculation of the surface can take a while; please be patient. The following graphical objects are available:

MOL1 :  the whole molecule coloured by atom type.
MOL2 :  the surface map of the molecule.
Questions:

1) Does this surface representation agree with the conclusions of AA0002?

2) Would you call the surface 'rather smooth' or 'rather rippled', or would you give it another description?

3) Where is the Hypothase active site located?

Primary Structure

Primary structure of Hypothase (AA0101)

A protein's polypeptide chain consists of a number of amino acids. The amino acid sequence is called the primary structure of the protein. Twenty amino acids will be shown on the screen. All residues will be labeled.

Questions:

1) Which is the smallest residue?

2) Which is the largest residue?

3) Which are the negatively charged residues?

4) Which are the positively charged residues?

5) Which are the alcoholic residues?

6) Which are the most hydrophobic residues?

7) Which are the most flexible residues?

8) Which are the most rigid residues?

Classes of amino acids (AA0102)

Depending on the chemical nature of the side chain, the amino acids are usually divided into a number of different classes:

acidic amino acids             (Glu, Asp)             RED
basic amino acids              (Arg, Lys, His)        BLUE
polar residues                 (Gln, Asn)             PURPLE
sulphur containing amino acids (Cys, Met)             YELLOW
small hydrophobic residues     (Gly, Ala, Val, Pro)   GREEN
large hydrophobic residues     (Leu, Trp, Phe, Ile)   GREEN
alcoholic residues             (Thr, Ser, Tyr)        GREENISH
Questions:

1) Label at least one residue from each class of residues.
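
The classification above amounts to a lookup table from residue name to class and colour. A minimal Python sketch follows; the names RESIDUE_CLASSES and classify are hypothetical. Following the colour list given for AA0002, Glu and Asp are treated as the acidic residues and Gln and Asn as the polar ones.

```python
# Class name -> (three-letter residue codes, display colour), as in the table above.
RESIDUE_CLASSES = {
    "acidic": (("GLU", "ASP"), "RED"),
    "basic": (("ARG", "LYS", "HIS"), "BLUE"),
    "polar": (("GLN", "ASN"), "PURPLE"),
    "sulphur containing": (("CYS", "MET"), "YELLOW"),
    "small hydrophobic": (("GLY", "ALA", "VAL", "PRO"), "GREEN"),
    "large hydrophobic": (("LEU", "TRP", "PHE", "ILE"), "GREEN"),
    "alcoholic": (("THR", "SER", "TYR"), "GREENISH"),
}


def classify(residue):
    """Return (class name, colour) for a three-letter residue code."""
    code = residue.strip().upper()
    for name, (members, colour) in RESIDUE_CLASSES.items():
        if code in members:
            return name, colour
    raise KeyError(f"unknown residue: {residue!r}")


if __name__ == "__main__":
    for residue in ["Asp", "Lys", "Gln", "Met", "Ala", "Trp", "Ser"]:
        print(residue, *classify(residue))
```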

Secondary Structure

Secondary structure example - alpha helix (AA0201)

Different regions of the primary structure (sequence) are able to form regular secondary structures. Alpha helices and beta sheets are such secondary structure elements.

MOL1 :  An example of an alpha helix (poly-A).
MOL2 :  The backbone atoms of the alpha helix.
MOL3 :  The C-alpha atoms of the alpha helix.
Questions:

1) How many residues are there in one helical turn?

Secondary structure example - beta sheet (AA0202)

Different regions of the primary structure (sequence) are able to form regular secondary structures. Alpha helices and beta sheets are such secondary structure elements.

MOL1 :  An example of an anti-parallel beta sheet.
MOL2 :  The backbone atoms of the sheet,
MOL3 :  The C-alpha atoms of the sheet.
Questions:

1) Why do we call this an ANTIPARALLEL beta sheet?

Secondary structure example - beta sheet (AA0205)

Different regions of the primary structure (sequence) are able to form regular secondary structures. Alpha helices and beta sheets are such secondary structure elements.

MOL1 :  An example of a parallel beta sheet.
MOL2 :  The backbone atoms of the sheet.
MOL3 :  The C-alpha atoms of the sheet.
Questions:

1) Why do we call this a PARALLEL beta sheet?

Secondary structure example - alpha helix (AA0203)

The alpha helix is one of the major elements of secondary structure in proteins. Hydrogen bonds between main chain N and O atoms stabilize the helix.

MOL1 :  An example of an alpha helix (poly-A).
MOL2 :  The backbone atoms of the helix.
MOL3 :  The C-alpha atoms of the helix.
MOL4 :  The hydrogen bonds stabilizing the helix.
Questions:

1) Is there any regularity in the hydrogen bonding pattern?

Secondary structure example - beta sheet (AA0204)

Another major element of secondary structure in proteins is the (anti-)parallel beta sheet. As in alpha helices, hydrogen bonds between main chain N and O atoms stabilize the sheet.

MOL1 :  An example of an anti-parallel beta sheet.
MOL2 :  The backbone atoms of the sheet.
MOL3 :  The C-alpha atoms of the sheet.
MOL4 :  The hydrogen bonds stabilizing the sheet.
Questions:

1) Is there any regularity in the hydrogen bonding pattern?

Secondary structure example - beta sheet (AA0206)

Another major element of secondary structure in proteins is the (anti-)parallel beta sheet. As in alpha helices, hydrogen bonds between main chain N and O atoms stabilize the sheet.

MOL1 :  An example of a parallel beta sheet.
MOL2 :  The backbone atoms of the sheet.
MOL3 :  The C-alpha atoms of the sheet.
MOL4 :  The hydrogen bonds stabilizing the sheet.
Questions:

1) Is there any regularity in the hydrogen bonding pattern?

2) What are the major differences in hydrogen bonding between parallel and antiparallel beta sheets?

Ramachandran plot of Hypothase (AA0207)

A protein can only fold if the torsion angles in the backbone can vary from residue to residue. Two backbone angles are important: Phi and Psi.

  Phi = N-Calpha rotation
  Psi = Calpha-C rotation
A plot of Phi against Psi can be useful to get a quick impression of the secondary structure of a protein molecule. Because of steric hindrance not every combination of Phi & Psi torsion angles is possible. The allowed regions are indicated. The Phi/Psi plot of the protein Hypothase will be displayed:

MOL1 :  a Phi-Psi plot of Hypothase (click on a cross to identify
        the corresponding residue).
Questions:

1) Which areas in this plot correspond to alpha helix and beta strand?
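
Phi and Psi are torsion (dihedral) angles, each defined by four consecutive backbone atoms: C(i-1), N(i), C-alpha(i), C(i) for Phi, and N(i), C-alpha(i), C(i), N(i+1) for Psi. The sketch below shows one common way to compute such an angle from four atomic positions. It is a minimal illustration in Python, not the routine WHAT IF uses; the function name torsion_angle and the test coordinates are made up for this example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_angle(p1, p2, p3, p4):
    """Torsion angle in degrees about the p2-p3 bond, between -180 and 180."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal of the plane through p2, p3, p4
    # atan2 keeps the sign of the angle and is stable near 0 and 180 degrees.
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Invented coordinates: four points with a quarter turn about the middle bond.
    print(torsion_angle((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))
```

For Phi of residue i the four points would be the positions of C(i-1), N(i), C-alpha(i) and C(i); one (Phi, Psi) pair per residue gives one data point in the plot.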

Tertiary Structure

Combination of secondary structures (AA0301)

The structure of Triose Phosphate Isomerase (TIM) holds a number of helices and sheets. A number of possible ways to represent secondary structure elements is shown in the following objects:

MOL1 :  The complete subunit A of TIM.
MOL2 :  The backbone atoms.
MOL3 :  The C-alpha atoms.
MOL4 :  The hydrogen bonds stabilizing helices, sheets and the subunit.
MOL5 :  A ribbon representation of all the secondary structure elements.
Questions:

1) Follow the chain from N to C. Are there any regularities?

The structure of HIV-1 Protease sub-unit A (AA0302)

The complete structure of HIV-1 Protease consists of two sub-units which have almost identical primary and secondary structure. It holds a number of helices and sheets which are shown in the following objects:

MOL1 :  The complete sub-unit A of HIV-1 Protease.
MOL2 :  The backbone atoms.
MOL3 :  The C-alpha atoms.
MOL4 :  The dimer of HIV-1 Protease.
MOL5 :  A ribbon representation of all the secondary structure elements.
Questions:

1) How many secondary structure elements in the A subunit make a contact with the B subunit?

2) Where do you think the amino acids between which this protease cleaves will be located?

The structure of Hypothase (AA0303)

The tertiary structure of Hypothase is shown in the following objects:

MOL1 :  The complete Hypothase.
MOL2 :  The backbone atoms.
MOL3 :  The C-alpha atoms.
MOL4 :  The hydrogen bonds stabilizing helices, sheets and the subunit.
MOL5 :  A ribbon representation of all the secondary structure elements.
Questions:

1) Do you see the parallel and the antiparallel strands?

Example of anti-parallel beta strands (AA0304)

A ribbon representation of the tertiary structure of the enzyme Aspartate Transcarbamylase is shown. The beta sheets in this protein are all anti-parallel.

MOL1 :  The complete structure of Aspartate Transcarbamylase.
MOL2 :  A ribbon representation of all the secondary structure elements.

Example of parallel beta strands in Flavodoxin (AA0305)

A ribbon representation of the tertiary structure of the redox protein Flavodoxin is shown. The (crossing) beta sheets in this protein are all parallel.

MOL1 :  The complete structure of Flavodoxin.
MOL2 :  A ribbon representation of all the secondary structure elements.
Questions:

1) Follow the chain from N to C. How are the strands ordered?

Example of (anti)parallel beta barrel in Plastocyanin (AA0306)

A ribbon representation of the tertiary structure of the electron carrier Plastocyanin is shown. The (anti)parallel beta sheets in this protein are organised in the shape of a barrel.

MOL1 :  The complete structure of Plastocyanin.
MOL2 :  A ribbon representation of all the secondary structure elements.

Example of the helix-loop-helix motif in Calmodulin (AA0307)

A ribbon representation of the tertiary structure of the calcium binding protein Calmodulin is shown. The helix-loop-helix arrangement in this molecule is a very common motif in all kinds of proteins.

MOL1 :  The complete structure of Calmodulin.
MOL2 :  A ribbon representation of all the secondary structure elements.

Example of the hairpin-beta motif in Erabutoxin (AA0308)

A ribbon representation of the tertiary structure of the snake venom Erabutoxin is shown. The hairpin-beta arrangement in this molecule occurs very frequently in all kinds of proteins.

MOL1 :  The complete structure of Erabutoxin.
MOL2 :  A ribbon representation of all the secondary structure elements.

Example of the beta-alpha-beta-alpha motif in TIM (AA0309)

A ribbon representation of the tertiary structure of sub-unit A of the enzyme Triose Phosphate Isomerase (TIM) is shown. The beta-alpha-beta-alpha motif in this molecule is one of the more complex arrangements found in proteins.

MOL1 :  The structure of TIM sub-unit A.
MOL2 :  A ribbon representation of all the secondary structure elements.

Example of the helix-bundle motif in Myohemerythrin (AA0310)

A ribbon representation of the tertiary structure of the oxygen binding protein Myohemerythrin is shown. The helix-bundle motif in this molecule is a common arrangement in proteins.

MOL1 :  The structure of Myohemerythrin.
MOL2 :  A ribbon representation of all the secondary structure elements.

Quaternary Structure

The complete structure of Triose Phosphate Isomerase (AA0401)

The complete structure of Triose Phosphate Isomerase (TIM) consists of two sub-units which have identical primary and secondary structure. It holds a number of helices and sheets which are shown in the following objects:

MOL1 :  The complete structure of Triose Phosphate Isomerase (TIM).
MOL2 :  The backbone atoms.
MOL3 :  The C-alpha atoms.
MOL4 :  The hydrogen bonds stabilizing helices, sheets and the subunit.
MOL5 :  A ribbon representation of all the secondary structure elements.

The complete structure of HIV-1 Protease (AA0402)

The complete structure of HIV-1 Protease consists of two sub-units which have almost identical primary and secondary structure. It holds a number of helices and sheets which are shown in the following objects:

MOL1 :  The complete structure of HIV-1 Protease.
MOL2 :  The backbone atoms.
MOL3 :  The C-alpha atoms.
MOL4 :  The hydrogen bonds stabilizing helices, sheets and the subunit.
MOL5 :  A ribbon representation of all the secondary structure elements.

Problem 1: HIV-1 protease inhibitors

Introduction.


1) Structure of HIV-1 Protease
2) Homology of HIV-1 Protease with other structures
3) Inhibition of HIV-1 Protease

Structure of HIV-1 Protease

Introduction (BB0000)

The HIV-1 retrovirus is the causative agent of AIDS. In the course of viral replication, many retroviral structural proteins and enzymes are initially translated as polyproteins, which undergo cleavage to generate the functional proteins found in mature virions. Analysis of the sequences of retroviral proteases showed that these enzymes are members of the aspartyl protease family, on the basis of the observed conservation of a characteristic Asp-Thr-Gly active site sequence.

HIV-1 protease, with only 99 amino-acid residues, is the smallest of the retroviral proteases, and is much smaller than the microbial and mammalian aspartyl proteases, each of which contains approximately 325 residues.

Primary structure sub-unit A (BB0101)

The primary structure of sub-unit A of HIV-1 Protease is shown below.

   1-50   PQITLWQRPL VTIKIGGQLK EALLDTGADD TVLEEMSLPG RWKPKMIGGI
  51-99   GGFIKVRQYD QILIEICGHK AIGTVLVGPT PVNIIGRNLL TQIGCTLNF
A representation of sub-unit A of HIV-1 Protease will be shown on the screen.

Tertiary structure sub-unit A (BB0102)

The tertiary structure of sub-unit A of HIV-1 Protease will be shown on the screen. A number of secondary structure motifs are present in this protein, of which the large anti-parallel beta sheet is the most prominent.

MOL1 :  Sub-unit A of HIV-1 Protease.
MOL2 :  The backbone of sub-unit A.
MOL3 :  The C-alpha trace of sub-unit A.
MOL4 :  The H-bond pattern calculated for sub-unit A.

Tertiary structure sub-unit A (BB0103)

The tertiary structure of sub-unit A of HIV-1 Protease will be shown on the screen. A number of secondary structure motifs are present in this protein, of which the large anti-parallel beta sheet is the most prominent.

MOL1 :  Sub-unit A of HIV-1 Protease.
MOL2 :  The backbone of sub-unit A.
MOL3 :  The C-alpha trace of sub-unit A.
MOL4 :  The H-bond pattern calculated for sub-unit A.
MOL5 :  A ribbon representation of sub-unit A.

Quaternary structure sub-units A & B (BB0104)

The quaternary structure of HIV-1 Protease will be shown on the screen. The symmetric combination of sub-unit A and B will render the molecule active. The dimer exhibits exact crystallographic, twofold rotational (C2) symmetry.

MOL1 :  HIV-1 Protease.
MOL2 :  The sub-units displayed in different colours.
MOL3 :  The backbone.
MOL4 :  The C-alpha trace.
MOL5 :  The calculated H-bond pattern.
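
Twofold rotational (C2) symmetry means that a rotation of 180 degrees about the symmetry axis maps sub-unit A onto sub-unit B and vice versa. The Python sketch below illustrates this for an axis that is assumed, for simplicity, to coincide with the z axis through the origin; the coordinates are invented and the function name rotate_c2 is hypothetical.

```python
def rotate_c2(point):
    """Rotate a point by 180 degrees about the z axis: (x, y, z) -> (-x, -y, z)."""
    x, y, z = point
    return (-x, -y, z)


# Invented coordinates standing in for three atoms of sub-unit A.
subunit_a = [(1.0, 2.0, 0.5), (3.0, -1.0, 2.0), (0.5, 0.5, -1.0)]

# In an exactly C2-symmetric dimer, sub-unit B is the rotated copy of A.
subunit_b = [rotate_c2(p) for p in subunit_a]

# Applying the rotation twice returns every atom to its starting position.
assert [rotate_c2(p) for p in subunit_b] == subunit_a

# The dimer as a whole is unchanged by the rotation: A and B swap places.
dimer = set(subunit_a) | set(subunit_b)
assert {rotate_c2(p) for p in dimer} == dimer
```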

Quaternary structure sub-units A & B (BB0105)

The quaternary structure of HIV-1 Protease will be shown on the screen. The symmetric combination of sub-unit A and B will render the molecule active.

MOL1 :  HIV-1 Protease.
MOL2 :  The sub-units displayed in different colours.
MOL3 :  The backbone.
MOL4 :  The C-alpha trace.
MOL5 :  The calculated H-bond pattern.
MOL6 :  Ribbon representation of the molecule.

Superposed sub-units of HIV-1 Protease (BB0106)

To emphasize the fact that both A and B sub-units of HIV-1 Protease are very much alike, a superposition of both units is shown.

MOL1 :  HIV-1 Protease, sub-unit A.
MOL2 :  HIV-1 Protease, sub-unit B.
MOL3 :  The superimposed sub-units A & B.
MOL4 :  Backbone sub-unit A.
MOL5 :  Backbone superposed sub-unit B.
MOL6 :  C-alpha trace sub-unit A.
MOL7 :  C-alpha trace superposed sub-unit B.

Contacts between HIV-1 Protease sub-units (BB0107)

The sub-units A and B of HIV-1 Protease display many contacts which are shown as dotted lines.

MOL1 :  HIV-1 Protease, sub-unit A.
MOL2 :  HIV-1 Protease, sub-unit B.
MOL3 :  Contacts shown as dotted lines.
MOL4 :  HIV-1 Protease.
MOL5 :  Backbone sub-unit A.
MOL6 :  Backbone sub-unit B.
MOL7 :  C-alpha trace sub-unit A and B.

Contacts between the HIV-1 Protease sub-units (BB0108)

The contacts between sub-units A and B of HIV-1 Protease are graphically shown in a contact plot.

MOL1 :  Plot of the contacts calculated between sub-unit A and B.

Homology of HIV-1 Protease with other structures

Introduction (BB0200)

The HIV-1 protease structure was compared to the structures of a number of aspartyl proteases of various origin, for which coordinates were available from the Protein Data Bank. Sequence analysis showed similarity up to 52%. Large regions of the HIV-1 protease, including residues in the protease active site, have structural analogues in the microbial and porcine aspartyl proteases.

Homology of HIV-1 Protease with Rous Sarcoma Virus Protease (BB0201)

Protein sequence analysis has shown that the homology between HIV-1 Protease and the protease from the Rous Sarcoma Virus is relatively high (42% similarity, 20% identity). This example shows the superposed RSVP sub-unit A upon HIV-1 Protease sub-unit A.

MOL1 :  Sub-unit A of Rous Sarcoma Virus Protease.
MOL2 :  Superposed sub-unit A of HIV-1 Protease.
MOL3 :  C-alpha trace of Rous Sarcoma Virus Protease.
MOL4 :  C-alpha trace of superposed sub-unit A of HIV-1 Protease.
MOL5 :  The superposed sub-units.

Homology of HIV-1 Protease with Penicillopepsin (BB0202)

Protein sequence analysis has shown that there is homology between HIV-1 Protease and Penicillopepsin (38% similarity, 16% identity).

This example shows the superposed HIV-1 sub-unit A upon Penicillopepsin.


MOL1 :  Penicillopepsin.
MOL2 :  Superposed sub-unit A of HIV-1 Protease.
MOL3 :  C-alpha trace of Penicillopepsin.
MOL4 :  C-alpha trace of superposed sub-unit A of HIV-1 Protease.
MOL5 :  The superposed sub-unit on Penicillopepsin.

Homology of HIV-1 Protease with Endothiapepsin (BB0203)

Protein sequence analysis has shown that there is homology between HIV-1 Protease and Endothiapepsin from Chestnut Blight Fungus (40% similarity, 16% identity).

This example shows the superposed HIV-1 sub-unit A upon Endothiapepsin.


MOL1 :  Endothiapepsin.
MOL2 :  Superposed sub-unit A of HIV-1 Protease.
MOL3 :  C-alpha trace of Endothiapepsin.
MOL4 :  C-alpha trace of superposed sub-unit A of HIV-1 Protease.
MOL5 :  The superposed sub-unit on Endothiapepsin.

Homology of HIV-1 Protease with Rhizopuspepsin (BB0204)

Protein sequence analysis has shown that there is homology between HIV-1 Protease and Rhizopuspepsin (43% similarity, 19% identity).

This example shows the superposed HIV-1 sub-unit A upon Rhizopuspepsin.


MOL1 :  Rhizopuspepsin.
MOL2 :  Superposed sub-unit A of HIV-1 Protease.
MOL3 :  C-alpha trace of Rhizopuspepsin.
MOL4 :  C-alpha trace of superposed sub-unit A of HIV-1 Protease.
MOL5 :  The superposed sub-unit on Rhizopuspepsin.

Homology of Endothiapepsin with Penicillopepsin (BB0205)

Protein sequence analysis has shown that there exists large sequence homology between Penicillopepsin and Endothiapepsin (67% similarity, 5% identity).

This example shows the superposed Penicillopepsin upon Endothiapepsin.


MOL1 :  Endothiapepsin.
MOL2 :  Penicillopepsin.
MOL3 :  C-alpha trace of Endothiapepsin.
MOL4 :  C-alpha trace of superposed Penicillopepsin.
MOL5 :  The superposed Penicillopepsin on Endothiapepsin.

Homology of Rhizopuspepsin with Endothiapepsin (BB0206)

Protein sequence analysis has shown that there exists large sequence homology between Rhizopuspepsin and Endothiapepsin (56% similarity, 38% identity).

This example shows the superposed Rhizopuspepsin upon Endothiapepsin.


MOL1 :  Endothiapepsin.
MOL2 :  Rhizopuspepsin.
MOL3 :  C-alpha trace of Endothiapepsin.
MOL4 :  C-alpha trace of superposed Rhizopuspepsin.
MOL5 :  The superposed Rhizopuspepsin on Endothiapepsin.

Homology of HIV-1 Protease with Pepsin (BB0207)

Protein sequence analysis has shown that there exists large sequence homology between sub-unit A of HIV-1 Protease and Pepsin (52% similarity, 27% identity).

This example shows the superposed sub-unit A of HIV-1 Protease upon Pepsin.

Inhibition of HIV-1 Protease

Active site of HIV-1 Protease (BB0301)

The active site of HIV-1 Protease is formed as the result of (symmetric) interactions between sub-units A and B.

By moving the molecule around, one can look through the hole in the structure that forms the active site. As an example, the surface area of a co-crystallized inhibitor is displayed to give an idea of the extent of the cavity.


MOL1 :  HIV-1 Protease, sub-unit 1.
MOL2 :  HIV-1 Protease, sub-unit 2.
MOL3 :  Surface map of inhibitor A74704.

Example of a HIV-1 Protease inhibitor A-74704 (BB0302)

There are numerous studies into the inhibition of HIV-1 Protease. A large number of candidate inhibitors and their analogs have been synthesized and tested `in computro' as well as in vitro. An example of such an inhibitor is A-74704, which consists of a core unit (COR), 2 valine residues (VAL) and 2 carbobenzyloxy fragments (CBZ). Like the active site of HIV-1 Protease, A-74704 has C2 symmetry.

MOL1 :  HIV-1 Protease, sub-unit A.
MOL2 :  HIV-1 Protease, sub-unit B.
MOL3 :  Inhibitor A74704.
MOL4 :  C-alpha trace sub-unit A.
MOL5 :  C-alpha trace sub-unit B.

Contacts of inhibitor A-74704 with HIV-1 Protease (BB0303)

A number of HIV-1 Protease residues are involved in the binding of A-74704. These sub-sites (atoms that lie within a 4.2 Angstrom radius of any atom of the designated group of the inhibitor) are listed in the table below. The contacts are visualized accordingly.

MOL1 :  HIV-1 Protease, sub-unit A.
MOL2 :  HIV-1 Protease, sub-unit B.
MOL3 :  Inhibitor A74704.
MOL4 :  C-alpha trace sub-unit A.
MOL5 :  C-alpha trace sub-unit B.
MOL6 :  Contacts for A-74704 with HIV-1 Protease.

 Enzyme    Inhibitor    
Sub-site     Group      Enzyme residues

   S3         CBZ       Gly27-A, Ala28-A, Asp29-A, Asp30-A, Gly48-A, 
                        Met46-A, Ile47-A
   S3'        CBZ'      Gly27-B, Ala28-B, Asp29-B, Asp30-B, Gly48-B, 
                        Arg8-A
   S2         VAL       Ala28-A, Val32-A, Ile47-A, Gly48-A, Gly49-A, 
                        Ile50-A, Ile84-A
   S2'        VAL'      Ala28-B, Val32-B, Ile47-B, Gly48-A, Gly49-B, 
                        Ile50-B, Ile84-A, Asp29-B, Ile50-A
   S1         COR       Leu23-A, Asp25-A, Gly27-A, Ala28-A, Gly49-A, 
                        Ile50-A, Val82-A, Ile84-A
   S1'        COR'      Leu23-B, Asp25-B, Gly27-B, Ala28-B, Gly49-B, 
                        Ile50-B, Val82-B, Ile84-B, Pro81-B, Arg8-A
           Central-OH   Asp25-A, Gly27-A, Ala28-A
                        Asp25-B, Gly27-B
           Buried H2O   Gly49-A, Ile50-A, Gly49-B, Ile50-B
                        VAL, VAL', COR (inhibitor groups)
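
The sub-site definition used above (every enzyme atom within 4.2 Angstrom of any atom of an inhibitor group) is a simple distance search. The Python sketch below shows the idea. The function name residues_in_contact is hypothetical and the coordinates are invented toy values, not taken from the HIV-1 Protease structure; only the 4.2 Angstrom radius comes from the text.

```python
import math


def residues_in_contact(protein_atoms, group_atoms, radius=4.2):
    """List the residues that have at least one atom within `radius` of the group.

    protein_atoms: iterable of (residue_label, (x, y, z)), one entry per atom.
    group_atoms:   iterable of (x, y, z) for the atoms of one inhibitor group.
    """
    group_atoms = list(group_atoms)
    found = []
    for label, xyz in protein_atoms:
        if label in found:
            continue  # one contacting atom is enough to list the residue
        if any(math.dist(xyz, g) <= radius for g in group_atoms):
            found.append(label)
    return found


if __name__ == "__main__":
    # Invented coordinates; the labels only mimic the style of the table above.
    enzyme = [
        ("Asp25-A", (0.0, 0.0, 0.0)),
        ("Asp25-A", (1.0, 0.0, 0.0)),
        ("Gly27-A", (3.0, 0.0, 0.0)),
        ("Arg8-A", (20.0, 0.0, 0.0)),
    ]
    central_oh = [(0.0, 4.0, 0.0)]
    print(residues_in_contact(enzyme, central_oh))
```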