One of the most distinctive features of 3decision is its nearly complete indexing of known and putative binding sites. 3decision lets you search for similar protein binding sites or subparts of them. This makes it possible to quickly find structurally similar local environments and to get information on small-molecule or peptide fragments that might bind to similar locations. It can also point to potential off-targets that need to be monitored during compound design cycles within a project.
Almost all of the literature focuses on comparing full binding sites (whole pockets). In contrast, the algorithm and data structures integrated in 3decision allow sub-pocket comparisons, i.e. comparisons of parts of pockets. This is algorithmically more complex, and no evaluation or benchmark exists yet in the literature for such use cases. You can, however, run full binding site comparisons as well.
Most methods and benchmarks are set up so that binding sites are compared one pair at a time. For example: is the ATP binding site of 4CWR similar to the ATP binding site of 4GFO (two different proteins)? Each comparison typically returns a similarity score; you can then rank all comparisons and derive performance measures such as TPR vs. FPR, precision-recall curves, or F1 scores (if you want to assess the scores).
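To make the pairwise benchmark setup concrete, here is a minimal sketch of how such measures could be derived from a list of comparison scores. The scores, labels and threshold are made up for illustration and do not come from 3decision or from any published benchmark.

```python
from typing import Dict, List, Tuple


def confusion_counts(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[int, int, int, int]:
    """Count TP, FP, TN, FN when pairs scoring >= threshold are called similar."""
    tp = fp = tn = fn = 0
    for score, is_similar in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_similar:
            tp += 1
        elif predicted and not is_similar:
            fp += 1
        elif not predicted and is_similar:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(
    scores: List[float], labels: List[bool], threshold: float
) -> Dict[str, float]:
    """Return TPR (recall), FPR, precision and F1 at a single score threshold."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    return {"tpr": tpr, "fpr": fpr, "precision": precision, "f1": f1}


# Made-up similarity scores for four pocket pairs, and whether each pair
# is considered truly similar in the (hypothetical) reference set.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [True, False, True, False]
result = metrics(scores, labels, threshold=0.5)
```

Sweeping the threshold over all observed scores and collecting the (FPR, TPR) or (recall, precision) pairs gives the usual ROC and precision-recall curves.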
3decision's pocket search does not follow this pairwise setup. It is designed from a database search and retrieval perspective: given a query pocket or subpocket, it retrieves all pockets and subpockets that match that query under a given set of search parameters.
Schematic representation of the feature detection and definition process on a tyrosine. Physico-chemical features on residues lining the binding site are identified, and each is assessed for whether it contributes to the accessible surface area of the pocket. The Cα position is tracked, and the presence or absence of each feature is recorded in a 6-bit binary vector, with each bit representing one possible feature.
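The 6-bit encoding can be illustrated with a small sketch. The six feature names, the bit order and the features assigned to tyrosine below are assumptions chosen for illustration; the source only states that there are six possible features, one per bit.

```python
from enum import IntFlag
from typing import Iterable


class Feature(IntFlag):
    """Hypothetical feature names; only the count of six comes from the text."""
    HBOND_DONOR = 1 << 0
    HBOND_ACCEPTOR = 1 << 1
    AROMATIC = 1 << 2
    HYDROPHOBIC = 1 << 3
    POSITIVE = 1 << 4
    NEGATIVE = 1 << 5


def encode(features: Iterable[Feature]) -> int:
    """Set one bit per feature that is present; absent features stay 0."""
    vector = 0
    for feature in features:
        vector |= int(feature)
    return vector


def to_bits(vector: int) -> str:
    """Render the vector as a fixed-width 6-character bit string."""
    return format(vector, "06b")


def shares_all(query: int, candidate: int) -> bool:
    """True if every feature set in the query is also set in the candidate."""
    return (query & candidate) == query


# Illustrative assignment for a tyrosine side chain: the hydroxyl can donate
# and accept hydrogen bonds, and the ring is aromatic.
tyrosine = encode([Feature.HBOND_DONOR, Feature.HBOND_ACCEPTOR, Feature.AROMATIC])
```

Storing features as bits keeps the per-residue record compact and lets feature matching be expressed as a bitwise AND.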
Searching for a binding site starts with a set of residues (features) defined by the user on a particular binding site of a particular structure. The main search has now been ported nearly entirely to an SQL query. It roughly goes through the following steps:
There are two ways to access Pocket Similarity Search in 3decision. To run a pocket comparison, your structure of interest has to be loaded in the 3D viewer first.
2. At the top of the 3D viewer, next to SELECTION, click the small arrow. This opens a menu containing "Find similar pockets..."
Once the panel is open, you can choose between
You can click directly in the 3D viewer, or use the shortcut Alt+left-click on a ligand atom to select residues. They will appear in the section Residues.
Once in the list, you can deselect one residue by clicking on the blue square.
To define the parameters, two modes are available.
Either there is no further information to be found, or your search parameters are too stringent.
Try loosening the local angle threshold, or switch to a number of residue matches > 3 or 4.
This should not happen. Please reach out to 3decision-support@discngine.com with the parameters of your search so we can fix this.
Not yet, but we are planning to integrate a filter so you can focus only on different biomolecular structures or on those that differ from your query structure.
To show that our integration of subpocket searches works in the context of general pocket comparison, one obvious test is whether roughly the same binding site on the same protein can be found across different structures, with or without conformational changes.
One could easily prepare a set of good-quality structures of the same biomolecule sequence, encompassing the same domain and overall binding site. This has already been done to a small extent in the paper cited above, where Dataset 1 (https://journals.plos.org/ploscompbiol/article/file?type=supplementary&id=10.1371/journal.pcbi.1006483.s010) is a list of PDB structures with the chain and ligand code identifying the binding site they worked on. The machine-readable file looks a bit like this:
The following tests aim to show the impact of some of the parameters exposed in the 3decision front-end on the final results. These are preliminary benchmark results; performance and parameter sets will likely evolve in future releases.
The first test we ran is, for each pocket in the dataset:
This was done with two different parameter sets: a more stringent one on the left-hand side and a more relaxed one on the right-hand side.
Ideally, every box would be at a 100% true positive rate. For HSP90 (HS90_HUMAN), for instance, this would mean that whichever binding site you take from whichever structure, you are sure to retrieve nearly all of the others in the database.
The parameter set on the right, as an example, was:
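As a rough sketch of this kind of evaluation, the per-query true positive rate can be computed as the fraction of the other pockets of the same protein that a query retrieves. The pocket identifiers, the second protein name and the hit lists below are invented, and the exact counting used in the 3decision benchmark is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Set


def retrieval_rates(
    protein_of: Dict[str, str], hits: Dict[str, Set[str]]
) -> Dict[str, List[float]]:
    """For each protein, list the true positive rate obtained by each query pocket.

    protein_of maps a pocket id to its protein; hits maps a query pocket id
    to the set of pocket ids returned by the search.
    """
    members: Dict[str, Set[str]] = defaultdict(set)
    for pocket, protein in protein_of.items():
        members[protein].add(pocket)

    rates: Dict[str, List[float]] = defaultdict(list)
    for query, protein in protein_of.items():
        expected = members[protein] - {query}  # the other pockets of that protein
        if not expected:
            continue  # a protein with a single pocket has nothing to retrieve
        found = hits.get(query, set()) & expected
        rates[protein].append(len(found) / len(expected))
    return dict(rates)


# Invented example: three pockets of one protein, two of another.
protein_of = {
    "A1": "HS90_HUMAN", "A2": "HS90_HUMAN", "A3": "HS90_HUMAN",
    "B1": "OTHER_PROTEIN", "B2": "OTHER_PROTEIN",
}
hits = {
    "A1": {"A2", "A3", "B1"},  # B1 is a hit from another protein, ignored here
    "A2": {"A1"},
    "A3": {"A1", "A2"},
    "B1": {"B2"},
    "B2": set(),
}
rates = retrieval_rates(protein_of, hits)
```

Each list in `rates` holds one value per query pocket, which is the kind of distribution a box plot per protein would summarize.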
By raising fuzziness you will likely retrieve more (many more) hits, but you'll also increase possible noise.
On top of scientific accuracy, you typically want results as quickly as possible, so assessing search speed is also important. A full binding site search in 3decision will probably be a bit slower than a subpocket search (more heavy lifting on the database side). The fuzzier your parameters, the more results you will retrieve and have to post-process, which also impacts performance. A typical full pocket search should take around a minute, and subpocket searches usually less.