THE DATASET OF PROTEIN-PROTEIN INTERFACES
In our definition, a two-chain protein interface is composed of
interacting residues and nearby residues. The interacting (interface)
residues are picked first. If the distance between any two atoms of two
residues, one from each chain, is less than the sum of their van der Waals
radii plus 0.5 Angstrom, both residues are registered as interface
residues. When fewer than 10 interface residues are assigned, an arbitrary
but reasonable number reflecting the minimum requirement for contact, the
interface is considered a result of crystal packing forces and is not
considered further. To reveal the types of architecture at the interface,
residues whose alpha carbon atom lies within 6.0 Angstrom of an alpha
carbon atom of a previously assigned interface residue are also included
and named nearby residues. The 6.0 Angstrom cutoff, selected by trial and
error, is very close to the smallest distance that includes residues not
involved in the interface but essential for showing the scaffold of the
interface.
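The selection rules above can be sketched in Python as follows. This is a minimal illustration, not the code used to build the dataset. The van der Waals radii are commonly quoted approximate values and are an assumption, as are the data layout (a chain as a dictionary of residues, each a dictionary of atoms) and all function names. The sketch also assumes that the 10-residue minimum counts residues from both chains together and that nearby residues are searched for within the same chain as the interface residue; the text does not state either point explicitly.

```python
import math

# Approximate van der Waals radii in Angstrom (assumed values, not from the text).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
CONTACT_MARGIN = 0.5          # Angstrom added to the sum of the two radii
NEARBY_CA_CUTOFF = 6.0        # Angstrom between alpha carbons
MIN_INTERFACE_RESIDUES = 10   # below this, treat as crystal packing

# A chain is {residue_id: {atom_name: (element, (x, y, z))}} (hypothetical layout).

def atoms_in_contact(atom_a, atom_b):
    """True if two atoms are closer than the sum of their radii plus the margin."""
    limit = VDW_RADII[atom_a[0]] + VDW_RADII[atom_b[0]] + CONTACT_MARGIN
    return math.dist(atom_a[1], atom_b[1]) < limit

def interface_residues(chain_a, chain_b):
    """Return the residue ids of each chain that have at least one atom contact."""
    contacts_a, contacts_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(atoms_in_contact(p, q)
                   for p in atoms_a.values() for q in atoms_b.values()):
                contacts_a.add(res_a)
                contacts_b.add(res_b)
    return contacts_a, contacts_b

def nearby_residues(chain, contact_ids):
    """Non-interface residues whose CA is within the cutoff of an interface CA."""
    nearby = set()
    for res_id, atoms in chain.items():
        if res_id in contact_ids or "CA" not in atoms:
            continue
        for contact_id in contact_ids:
            contact_atoms = chain[contact_id]
            if "CA" in contact_atoms and math.dist(
                    atoms["CA"][1], contact_atoms["CA"][1]) <= NEARBY_CA_CUTOFF:
                nearby.add(res_id)
                break
    return nearby

def is_crystal_packing(contacts_a, contacts_b):
    """Assumption: the minimum applies to the residues of both chains combined."""
    return len(contacts_a) + len(contacts_b) < MIN_INTERFACE_RESIDUES

# Toy coordinates invented for illustration only.
toy_a = {1: {"CA": ("C", (0.0, 0.0, 0.0))},
         2: {"CA": ("C", (3.8, 0.0, 0.0))},
         3: {"CA": ("C", (20.0, 0.0, 0.0))}}
toy_b = {1: {"CA": ("C", (0.0, 3.5, 0.0))}}

toy_contacts_a, toy_contacts_b = interface_residues(toy_a, toy_b)
toy_nearby_a = nearby_residues(toy_a, toy_contacts_a)
```

In the toy input, residue 1 of each chain is in contact, residue 2 of the first chain is picked up as a nearby residue, and the pair would be discarded as crystal packing because it has far fewer than 10 interface residues.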
In this study, we have
exhaustively surveyed the structures of protein-protein interfaces in the PDB in
order to carry out structural comparisons of the interfaces. This work is an
extension of the study by Nussinov and coworkers (Tsai et al., 1996).
We started
generating the dataset by extracting the interfaces between chains from
the PDB crystallographic coordinates. As of July 18, 2002, there were 18,687
entries in the PDB, comprising 35,112 single chains (counting all individual
chains in dimers, trimers and so on). The dataset of interfaces contains 21,704
two-chain interfaces.
The figure below shows an
example of the interfaces among three chains of the protein glutathione S-transferase (PDB code 1gwc). Here the chain pairs
AB and BC have contacting residues, whereas chains A and C are not close enough to form an interface, so there are two interfaces. In the BC interface, the contacting residues are shown in magenta and the nearby residues in cyan. In the AB interface, the interacting residues are shown in red and
the neighboring (nearby) residues next to them in yellow.
The side chains of the interacting residues are also displayed in the figure.
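How a multi-chain entry breaks down into two-chain interfaces can be sketched as a loop over all chain pairs. The example below is only an illustration: it uses a single distance cutoff as a simplified stand-in for the van der Waals criterion, it omits the 10-residue minimum, and the coordinates are invented toy values rather than data from 1gwc. The function names are hypothetical.

```python
import math
from itertools import combinations

def chains_in_contact(atoms_a, atoms_b, cutoff=4.0):
    """Simplified stand-in: any atom pair closer than a single cutoff (Angstrom)."""
    return any(math.dist(p, q) < cutoff for p in atoms_a for q in atoms_b)

def two_chain_interfaces(chains, cutoff=4.0):
    """Return every pair of chain ids whose atoms come into contact."""
    return [(a, b) for a, b in combinations(sorted(chains), 2)
            if chains_in_contact(chains[a], chains[b], cutoff)]

# Three toy chains in a row: A touches B, B touches C, A and C are too far apart.
toy_chains = {"A": [(0.0, 0.0, 0.0)],
              "B": [(3.0, 0.0, 0.0)],
              "C": [(6.0, 0.0, 0.0)]}

toy_pairs = two_chain_interfaces(toy_chains)
```

With these toy chains the loop reports the AB and BC pairs but not AC, mirroring the situation described for the three-chain example.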