Virtual screening with VEGA ZZ and GriDock
1 Introduction
2 What you need
3 NAMD installation
4 Download of the target protein
5 Protein preparation
6 Creation of the input files for GriDock/AutoDock
7 Databases
8 Evaluation of the starting complex
9 Running the screening
10 Results
Structure-based virtual screening is a useful approach for finding new hit compounds in a database of 3D molecules. This tutorial explains how to prepare the input files required by GriDock to find potential inhibitors of the HIV-1 protease.
If you have not already installed NAMD, follow these steps; otherwise skip this section.
4 Download of the target protein
You can download the HIV-1 protease structure (1HPV) through the PDB Web interface or the tool integrated in VEGA ZZ:
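As an alternative to the two routes above, the same entry can be fetched from a script. The following is a minimal sketch, not part of VEGA ZZ: the function names are hypothetical, and the `files.rcsb.org/download/` URL pattern is an assumption about the RCSB file server that you should verify before relying on it.

```python
import urllib.request


def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the download URL for a PDB entry (URL pattern is an assumption)."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


def download_pdb(pdb_id: str, dest: str) -> str:
    """Fetch the entry and save it to dest; this call needs network access."""
    urllib.request.urlretrieve(rcsb_pdb_url(pdb_id), dest)
    return dest


if __name__ == "__main__":
    # Only the URL is printed here; call download_pdb("1HPV", "1HPV.pdb")
    # to actually fetch the file.
    print(rcsb_pdb_url("1hpv"))
```

The saved file can then be opened in VEGA ZZ like any other local PDB file.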
The hydrogens may be added incorrectly to the co-crystallized ligand, because the algorithm is protein-specific and the ligand can assume an unusual geometry in the binding pocket.
To optimize the structure completed by adding the hydrogens, the atomic charges and the atom potentials must first be assigned.
Now we are ready to run the NAMD minimization, but in order to preserve the starting experimental structure, we need to constrain the protein backbone, whose coordinates will be kept fixed during the calculation.
6 Creation of the input files for GriDock/AutoDock
The crystallization water molecules are not needed and can cause problems, because they are treated as fixed and cannot be displaced by the ligand during the docking calculation.
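The tutorial removes the waters inside VEGA ZZ. To illustrate what that step amounts to at the file level, here is a minimal sketch that drops water records from PDB-format lines. The function name and the set of water residue names are assumptions, and the sample records are made up for illustration.

```python
# Residue names commonly used for water in PDB files (assumed list).
WATER_NAMES = {"HOH", "WAT", "H2O"}


def strip_waters(pdb_lines):
    """Return the PDB lines without water ATOM/HETATM records."""
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()       # columns 1-6: record name
        res_name = line[17:20].strip()  # columns 18-20: residue name
        if record in ("ATOM", "HETATM") and res_name in WATER_NAMES:
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    # Made-up sample records in fixed-column PDB layout.
    sample = [
        "ATOM      1  N   PRO A   1      29.361  39.686   5.862  1.00 38.10           N",
        "HETATM 1517  O   HOH A 401      10.000  11.000  12.000  1.00 30.00           O",
        "HETATM 1600  C1  LIG A 200      15.000  16.000  17.000  1.00 20.00           C",
    ]
    for line in strip_waters(sample):
        print(line)
```

Only the water record is dropped; the protein atom and the non-water HETATM record are kept.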
In order to generate complexes in which the ligand is placed in the same pocket as the co-crystallized one, you need to select the atoms included in a 10 Å sphere around the binding site.
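The sphere selection is a plain distance test. The sketch below shows the idea outside VEGA ZZ, assuming the sphere is centered on the centroid of the co-crystallized ligand (the tutorial does not state how the center is defined); the function names and coordinates are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def atoms_within(atoms, center, radius=10.0):
    """Return the labels of atoms lying within radius (in Å) of center.

    atoms is an iterable of (label, (x, y, z)) pairs.
    """
    return [label for label, xyz in atoms if math.dist(xyz, center) <= radius]


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    ligand_xyz = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    receptor = [
        ("A", (1.0, 0.0, 9.0)),
        ("B", (12.0, 0.0, 0.0)),
        ("C", (1.0, 10.0, 0.0)),
    ]
    center = centroid(ligand_xyz)
    print(atoms_within(receptor, center))
```

Atoms lying exactly on the sphere surface are included because the comparison uses `<=`.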
To run virtual screenings with GriDock, the receptor structure must be pre-processed by assigning the AMBER atom types, fixing the atom charges, removing the apolar hydrogens and saving the molecule in PDBQT format. All these steps are performed automatically by the Docking\AutoDock\Receptor.c script.
GriDock requires one or more databases that must be manageable by VEGA (SDF or Zip format) and must contain the 3D structures of the molecules that you want to screen. You can build your own databases using the tools included in VEGA ZZ, convert a database from 2D to 3D with the Database 2D to 3D.c script, or download one from Web sites such as Ligand.Info: Small-Molecule Meta-Database (http://ligand.info). In this tutorial, a small database will be downloaded from that Web site.
If you want to check the database, you can open it with VEGA ZZ:
8 Evaluation of the starting complex
In order to compare the screening results with the co-crystallized ligand, it is useful to evaluate the interaction energy of the starting complex. As a first step, you need to extract the ligand from the crystal structure.
WARNING:
Don't normalize the ligand atom coordinates because you need to preserve the
starting position.
To start the screening on a Windows system:
gridock -t score.dpf 1HPV.pdbqt Ligand.pdbqt
The -t option selects a different input template file for AutoDock 4. The score.dpf template keeps the starting position and conformation of the ligand and evaluates the interaction energy. If you need more information about GriDock, please read the manual.
This is an example of the log output:
10:31:39 INIT: GriDock 1.0.0.20 started on Windows
10:31:39 INIT: Local time Mon, 02 Feb 2009 11:31:39
10:31:39 INIT: Cpu model: AMD Opteron(tm) Processor 250
10:31:39 INIT: CPUs/Cores detected: 2
10:31:39 INIT: CPUs/Cores used: 1
10:31:39 INIT: AutoDock/VEGA directory: "D:\Documenti\Lcc\Vega"
10:31:39 INIT: AutoDock executable: "D:\Documenti\Lcc\Vega\AutoDock4.exe"
10:31:39 INIT: VEGA executable: "D:\Documenti\Lcc\Vega\Vega.exe"
10:31:39 INIT: Receptor file: "1HPV.pdbqt"
10:31:39 INIT: Database file: "Ligand.pdbqt"
10:31:39 INIT: AutoDock template file: "score.dpf"
10:31:39 INIT: AutoDock output archive: "1HPV-Ligand_01.zip"
10:31:39 INIT: Max. size of AutoDock output archive: 4000000000 bytes
10:31:39 INIT: Energy output file: "1HPV-Ligand.csv"
10:31:39 INIT: Temporary file directory: "C:\DOCUME~1\ALESSA~1\IMPOST~1\Temp"
10:31:39 INIT: First molecule to dock: 1
10:31:39 INIT: Input database in PDBQT format: one molecule only will be docked
10:31:39 INIT: AMMP time-out: 120 sec.
10:31:39 INIT: AutoDock time-out: 12000 sec.
10:31:39 INIT: VEGA time-out: 120 sec.
10:31:39 INFO: Starting AutoDock - Molecule 1 (Ligand)
10:31:41 INFO: Molecule 1 - Docking finished (0m 2s)
10:31:41 DOCK: Molecule 1 - Best model 1, Best Binding energy = -8.29 kcal/mol, Ki = 838.53 uM
10:31:41 INFO: End of calculation
10:31:41 INFO: Docked molecules 1
10:31:41 INFO: Elapsed time 0h 0m 2s
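If you want to pull the docking result out of a log like this one automatically, a regular expression over the DOCK entry is enough. This is a minimal sketch based only on the single DOCK entry shown in this tutorial; the function name is hypothetical, and other GriDock versions or locales may format the entry differently.

```python
import re

# Pattern written against the one DOCK entry shown in this tutorial.
DOCK_RE = re.compile(
    r"DOCK: Molecule (?P<mol>\d+) - Best model (?P<model>\d+), "
    r"Best Binding energy = (?P<energy>-?\d+(?:\.\d+)?) kcal/mol, "
    r"Ki = (?P<ki>\d+(?:\.\d+)?) (?P<unit>\S+)"
)


def parse_dock_line(line):
    """Return the fields of a DOCK log entry as a dict, or None if no match."""
    m = DOCK_RE.search(line)
    if m is None:
        return None
    return {
        "molecule": int(m["mol"]),
        "model": int(m["model"]),
        "energy_kcal_mol": float(m["energy"]),
        "ki": float(m["ki"]),
        "ki_unit": m["unit"],
    }


if __name__ == "__main__":
    entry = ("10:31:41 DOCK: Molecule 1 - Best model 1, "
             "Best Binding energy = -8.29 kcal/mol, Ki = 838.53 uM")
    print(parse_dock_line(entry))
```

The unit is returned as written in the log rather than converted, so the caller decides how to interpret it.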
The line with the DOCK tag contains the information about the binding quality (binding energy and Ki). You can find the same information in the 1HPV-Ligand.csv file, formatted so that it can be handled by a spreadsheet (e.g. Microsoft Excel):
Database; MolID; Pose; Ki; Binding; Intermolecular; VdW + Hbond + Desolv; Electrostatic; Internal; Torsional; Unbound; Molecule name
Ligand; 1; 1; 838,53; -8,29; -9,99; -9,34; -0,65; -1,32; 3,02; 0,00; Ligand
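Outside a spreadsheet, the same file can be read with a few lines of code. The sketch below assumes the layout shown above: semicolon-separated fields and a decimal comma, which depends on the locale the file was written under. The function name and the choice of which columns are text or integer are assumptions.

```python
import csv
import io

TEXT_COLUMNS = {"Database", "Molecule name"}
INT_COLUMNS = {"MolID", "Pose"}


def parse_energy_csv(text):
    """Parse semicolon-separated GriDock-style output with decimal commas."""
    reader = csv.reader(io.StringIO(text), delimiter=";", skipinitialspace=True)
    header = [h.strip() for h in next(reader)]
    rows = []
    for fields in reader:
        if len(fields) != len(header):
            continue  # skip blank or truncated lines
        row = {}
        for name, value in zip(header, fields):
            value = value.strip()
            if name in TEXT_COLUMNS:
                row[name] = value
            elif name in INT_COLUMNS:
                row[name] = int(value)
            else:
                # Convert the decimal comma to a point before parsing.
                row[name] = float(value.replace(",", "."))
        rows.append(row)
    return rows


if __name__ == "__main__":
    sample = (
        "Database; MolID; Pose; Ki; Binding; Intermolecular; "
        "VdW + Hbond + Desolv; Electrostatic; Internal; Torsional; "
        "Unbound; Molecule name\n"
        "Ligand; 1; 1; 838,53; -8,29; -9,99; -9,34; -0,65; -1,32; "
        "3,02; 0,00; Ligand\n"
    )
    row = parse_energy_csv(sample)[0]
    print(row["Molecule name"], row["Binding"], row["Ki"])
```

To read a real file, pass its contents, for example `parse_energy_csv(open("1HPV-Ligand.csv").read())`.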
If you want to check the resulting complex:
Now in the Screening directory, all files required by the screening are present:
Type the following command and hit return:
gridock 1HPV.pdbqt ChemBank.sdf
When GriDock starts, the number of installed CPUs/cores is detected and each of them is used by assigning it a working thread. The time required to screen all 2,344 molecules contained in the ChemBank.sdf database depends on the computational power of your system. You can check the progress of the calculation by viewing the log file (gridock_YYYYMMDD.log) with a text editor.
With the command
gridock -f 10 -l 200 1HPV.pdbqt ChemBank.sdf
only the molecules in the range from 10 to 200 will be screened. If you want to start from the first molecule, the -f option can be omitted:
gridock -l 200 1HPV.pdbqt ChemBank.sdf
To save time, you can screen only the first 200 molecules in the ChemBank database by running GriDock with the syntax shown in the previous line.
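The -f and -l options also make it possible to split a large database into consecutive ranges. The sketch below only builds the command lines and does not run them. It assumes that both range limits are inclusive, and the helper name is hypothetical. The tutorial does not say whether successive runs append to or overwrite the output files, so check the GriDock manual before running ranges one after another in the same directory.

```python
def chunk_commands(receptor, database, total, chunk_size):
    """Build one gridock argument list per consecutive range of molecules."""
    commands = []
    for first in range(1, total + 1, chunk_size):
        last = min(first + chunk_size - 1, total)
        commands.append(
            ["gridock", "-f", str(first), "-l", str(last), receptor, database]
        )
    return commands


if __name__ == "__main__":
    # 2,344 is the size of the ChemBank.sdf database used in this tutorial.
    for cmd in chunk_commands("1HPV.pdbqt", "ChemBank.sdf", 2344, 200):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run` if you prefer to launch the ranges from the script.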
You can consider ligands as potential HIV-1 protease inhibitors when the
binding energy and/or the binding constant (Ki) are lower than the values
obtained for the co-crystallized ligand, i.e. -8.29 kcal/mol and 838.53 uM respectively.
Considering the molecules from 1 to 200 in the ChemBank database and analyzing
the 1HPV-ChemBank.csv file, you can obtain these results:
Database; MolID; Pose; Ki; Binding; Intermolecular; VdW + Hbond + Desolv; Electrostatic; Internal; Torsional; Unbound; Molecule name
ChemBank; 123; 9; 72,55; -9,74; -9,74; -9,51; -0,23; 0,00; 0,00; 0,00; Paxilline
ChemBank; 101; 6; 282,90; -8,93; -9,83; -9,87; 0,03; -0,51; 0,82; -0,59; 24,25-Dihydroxyvitamin D3
ChemBank; 75; 4; 545,05; -8,54; -8,54; -8,40; -0,15; 0,00; 0,00; 0,00; Grayanotoxin III
ChemBank; 151; 7; 793,77; -8,32; -8,84; -8,82; -0,02; -0,30; 0,55; -0,27; Go6976
ChemBank; 17; 4; 796,25; -8,32; -9,10; -9,04; -0,06; -0,82; 1,10; -0,50; WIN 55, 212-2
ChemBank; 97; 4; 1,06; -8,15; -8,88; -8,88; -0,01; -0,68; 0,82; -0,59; 25-Hydroxyvitamin D3
ChemBank; 133; 4; 1,21; -8,07; -8,71; -7,72; -0,99; -0,33; 0,55; -0,42; TTNPB
...
The 1HPV-ChemBank.csv output file can also be analyzed easily by opening it with Microsoft Excel or any other spreadsheet software able to handle comma-separated value files (CSV format). The Windows version of GriDock detects the locale settings and writes the output with the required decimal separator, so that the file can be opened directly without conversion.
The complexes with a binding energy lower than -8.29 kcal/mol (the first five entries in the list above) could be potential HIV-1 protease inhibitors. Please remember that the results are reported for illustrative purposes only and most probably cannot be used in a real biological environment (e.g. Paxilline is a toxin produced by Penicillium paxilli, Grayanotoxin III is a toxic diterpenoid, etc.).
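The selection described above can also be reproduced in a few lines. The sketch below filters the sample results listed earlier, using the binding energy only. The Ki column is left out on purpose, because in the sample output its values do not appear to share a single unit across all rows; this is an observation about the sample, not documented GriDock behavior. The function name is hypothetical.

```python
# Binding energy of the co-crystallized ligand, used as reference (kcal/mol).
REFERENCE_ENERGY = -8.29

# (molecule name, binding energy) pairs copied from the sample results.
SAMPLE = [
    ("Paxilline", -9.74),
    ("24,25-Dihydroxyvitamin D3", -8.93),
    ("Grayanotoxin III", -8.54),
    ("Go6976", -8.32),
    ("WIN 55, 212-2", -8.32),
    ("25-Hydroxyvitamin D3", -8.15),
    ("TTNPB", -8.07),
]


def select_hits(results, threshold=REFERENCE_ENERGY):
    """Keep molecules that bind better than the reference, best first."""
    hits = [(name, energy) for name, energy in results if energy < threshold]
    return sorted(hits, key=lambda item: item[1])


if __name__ == "__main__":
    for name, energy in select_hits(SAMPLE):
        print(f"{energy:6.2f}  {name}")
```

With the sample values this keeps five molecules, from Paxilline to WIN 55, 212-2.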
WARNING:
Due to the stochastic nature of the genetic algorithms implemented in AutoDock
4, the results that you obtain may differ from the values reported in this
tutorial.