Dimer Classification Tutorial

Our lab has developed a new approach for distinguishing between biological and crystallographic (i.e. false) dimers in X-ray structures of proteins. We have trained our approach on a combined set of biologically relevant and crystallographic dimers, and it is now integrated into ClusPro. The underlying idea of the approach is to re-dock the subunits taken from an X-ray structure and to see how many docking solutions end up close to the original subunit orientation.
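The re-docking idea can be illustrated with a minimal sketch. It is not the ClusPro implementation: the function names, the use of plain coordinate RMSD as the similarity measure, and the 10 Å cutoff are all assumptions chosen for illustration. The sketch counts how many docked poses of one subunit lie close to its position in the X-ray structure.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) points, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def near_native_count(native, poses, cutoff=10.0):
    """Count docked poses whose RMSD to the native subunit position is within the cutoff.

    The cutoff (in angstroms) is a hypothetical value, not the one used by ClusPro.
    """
    return sum(1 for pose in poses if rmsd(native, pose) <= cutoff)


# Toy data: a two-atom "subunit" and three docked poses of it.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
poses = [
    [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],    # shifted by 1 angstrom
    [(20.0, 0.0, 0.0), (21.0, 0.0, 0.0)],  # shifted by 20 angstroms
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],    # identical to the native position
]
print(near_native_count(native, poses))
```

In this toy case two of the three poses fall within the assumed cutoff, so the count is 2.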

Our approach is an alternative to the popular PISA and EPPIC methods and can be used to make a consensus decision when PISA and EPPIC results disagree. It may also be helpful when PISA provides an uncertain result.
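One possible way to combine the three methods is a simple majority vote, sketched below. This is only an illustration of a consensus decision: the function name, the label strings, and the convention of using None for an uncertain result are assumptions, and the calls themselves must be collected by hand from each server.

```python
from collections import Counter


def consensus_call(calls):
    """Return the majority label among the methods, or None if there is no clear majority.

    `calls` maps a method name to "biological", "crystallographic",
    or None when that method gives an uncertain result.
    """
    votes = Counter(label for label in calls.values() if label is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    # A tie between the two top labels means no consensus.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical calls for one dimer, entered manually.
calls = {"PISA": "biological", "EPPIC": "crystallographic", "ClusPro": "biological"}
print(consensus_call(calls))
```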

  1. Before starting, keep in mind that the interface of interest is not necessarily contained within a single asymmetric unit of the crystal structure. Biologically relevant assemblies are very often spread across several contacting asymmetric units (more on that here). In such cases you will need to prepare a custom PDB file that contains the entire target interface. You can do this, for example, by generating symmetry mates of the asymmetric unit in PyMOL or another biomolecular modeling package.
  2. To use the method, specify the PDB structure of interest (either by entering a 4-character PDB identifier or by uploading your own PDB file) and choose the pair of protein chains in it to be tested for forming a biological dimer. As an example, we use chains A and B from PDB 12AS, which were experimentally shown to form a functional dimer.
  3. Once the job is complete, you can view the results by clicking on your job ID in the "Results" tab. You will be presented with the classification of the dimer ("Likely Biological" or "Likely Crystallographic"). You will also see the Near-native count, which is the number of docking solutions that arrange the subunits similarly to their original orientation in the PDB structure. This number underlies the classification decision: the larger it is, the higher the probability that the dimer is biologically relevant. The probability value reported on the next line is calculated from the performance of the method on a training set.
  4. For your reference, the results page contains a chart showing, for different Near-native counts, what percentage of cases in the training set were biological dimers. The dashed green line is a fitted curve. The vertical red line corresponds to the Near-native count of the user-submitted dimer. The intersection of the two lines gives the probability that the user-submitted dimer is biologically relevant.
  5. For a crystallographic dimer the Near-native count is typically much smaller, which results in classification as "Likely Crystallographic". Shown in the figure below is the classification result for 1AW7 chains C and D, the contact between which is known to be an artifact of crystallization.
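The relationship between the Near-native count and the reported probability can be sketched as follows. The sketch bins a toy training set by Near-native count to get the fraction of biological dimers per bin (as in the chart), then reads a probability off a fitted curve. The training data, the bin width, the logistic form of the curve, and its parameters are all assumptions made for illustration; the tutorial does not state which curve or values ClusPro uses.

```python
import math
from collections import defaultdict


def biological_fraction_by_bin(training, bin_width=10):
    """Fraction of biological dimers in each Near-native count bin.

    `training` is a list of (near_native_count, is_biological) pairs.
    Keys of the returned dict are the lower edges of the bins.
    """
    totals = defaultdict(int)
    biological = defaultdict(int)
    for count, is_bio in training:
        bin_start = (count // bin_width) * bin_width
        totals[bin_start] += 1
        if is_bio:
            biological[bin_start] += 1
    return {b: biological[b] / totals[b] for b in sorted(totals)}


def logistic_probability(count, midpoint, slope):
    """Probability read off an assumed logistic curve at a given Near-native count."""
    return 1.0 / (1.0 + math.exp(-slope * (count - midpoint)))


# Made-up training set, not real ClusPro data.
training = [(2, False), (5, False), (8, True), (14, True),
            (17, False), (25, True), (28, True)]
fractions = biological_fraction_by_bin(training)
print(fractions)

# Hypothetical curve parameters; in practice they would be fitted to the fractions.
print(logistic_probability(30, midpoint=12, slope=0.2))
```

With these toy numbers the fraction of biological dimers rises from the lowest bin to the highest, which mirrors the trend described in steps 3 to 5: a larger Near-native count corresponds to a higher probability of a biological dimer.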

If you have any questions, please check whether they are addressed on the help page. If you have any suggestions, please contact us.

ClusPro should only be used for noncommercial purposes.
Vajda Lab and ABC Group
Boston University and Stony Brook University