13.3 Scripts
13.3.1 Introduction
The VEGA ZZ package includes several scripts, which are placed in the ...\VEGA ZZ\Scripts directory with the following sub-directory structure:
This folder contains the templates used when a new script is created. It is not shown in the script tree, but you can reach it by selecting Help → Explore data directory and opening the Scripts folder.
OpenGL.c |
Template for OpenGL C scripts. |
Rebol.r | Template for REBOL scripts. |
Standard.c | Template for standard C scripts. |
Window API.c | Template window with Close button (Windows API version). |
Window GraphApp.c | Template window with Ok button (GraphApp GUI version). |
Window GraphApp Calc.c | Template window with Calculate button. When this button is clicked, the main window hides and the abort dialog is shown. Pressing its Abort button stops the calculation. This script template requires the GraphApp GUI. |
This directory includes scripts for ADMET (absorption, distribution, metabolism, elimination and toxicity) prediction.
BBB permeation predictor.c |
This script was generated automatically by Tree2C and uses a decision tree to classify molecules as permeant or non-permeant through the blood-brain barrier (BBB). For more information, click here. |
MetaClass builder.c
MetaClass predictor.vll |
MetaClass is a comprehensive classification system for the prediction of the metabolic reactions of a given molecule or of a set of molecules. The prediction is based on a machine-learning algorithm (Random Forest), which was trained on the metabolic data collected and classified in the MetaQSAR database. The MetaClass package includes two modules: MetaClass builder and MetaClass predictor. The former automatically generates not only the models but also the code that can be used directly in the VEGA ZZ environment for the prediction, while the latter is the result of a MetaClass builder run.
Software requirements for MetaClass predictor:
Software requirements for MetaClass builder:
MetaClass predictor
The MetaClass predictor is a compiled C-script (available for both x64 and x86 architectures) that predicts the metabolic reactions a given molecule undergoes according to the MetaQSAR rules. To run it, start VEGA ZZ, select File → Run script in the main menu, expand the ADMET branch and double-click MetaClass predictor.vll. If a molecule is present in the current workspace, the prediction is performed for that single molecule and the results are shown in the VEGA ZZ console. If the workspace is empty, a file requester is shown to select an input database, which must be in a format supported by VEGA ZZ (Microsoft Access, Mol2, ODBC data source, SDF, SMILES, SQLite or Zip). In this second case, the prediction is performed for all molecules in the database and the results are saved to a CSV file. Since the training set used in the learning phase (the substrates classified in the MetaQSAR database) includes only molecules in non-ionized form, with the exception of quaternary ammonium salts, the molecules for which you want to predict the metabolic reactions must also be in their neutral form. Here is the typical output shown in the VEGA ZZ console for a single-molecule prediction (e.g. naproxen):
* Prediction of MetaQSAR metabolic reactions *
Assigning atom charges
Total charge: 0.00
Reaction                                       Substrate  Dom. viol.
======================================================================
01 - Oxidation of Csp3                            Yes        0
02 - Oxidation of Csp2 & Csp                      Yes        0
03 - -CHOH <-> >C=O -> -COOH                      Yes        0
04 - Various redox reactions of carbon atoms      No         0
05 - Redox reactions of R3N                       No         0
06 - Oxidation of >NH, >NOH and -N=O // Reduc     Yes        0
07 - Oxidation to quinones or analogs // Redu     Yes        0
08 - Oxidation and reduction of S atoms           No         0
09 - Redox reactions of other atoms               Yes        0
11 - Hydrolysis of esters, lactones and inorg     Yes        0
12 - Hydrolysis of amides, lactams and peptid     No         0
13 - Epoxide hydration                            No         0
14 - Other hydrolysis/hydration reactions //      No         0
21 - O-Glucuronidations & glycosylations          No         0
22 - N- and S-Glucuronidations // All other g     No         0
23 - Sulfonations (O-, N-, ...)                   Yes        0
24 - GSH & RSH conjugations + sequels // GSH-     No         0
25 - Acetylations & acylations                    No         0
26 - CoASH-Ligation followed by amino acid co     Yes        0
27 - Methylations (O-, N-, S-)                    No         0
28 - Other conjugations (PO4, CO2, ...) // Tr     No         0
For each reaction class (according to the metabolic reaction classification in MetaQSAR), the output table shows the code and the description (Reaction column), whether the molecule is a substrate or not (Substrate column) and the number of domain violations (Dom. viol. column). This counter indicates how many parameters/attributes fall outside the range of the property space of the training set. If this value is not zero, the prediction might be less accurate.
MetaClass builder
The MetaClass builder generates both x64 and
x86 versions of the MetaClass predictor only if both VEGA ZZ
x86 and x64 are installed. Usually, only one of the two versions is
installed according to your operating system, but you can override this
behaviour during the VEGA ZZ setup by choosing the installation
of the Live CD creator component.
For each reaction class you can find:
As explained above, DATABASE_NAME - Model performances.csv reports several statistics on the performance of the models built by Weka:
|
Mutagenicity predictor.c |
This script was generated automatically by Tree2C and uses a decision tree to classify molecules as mutagenic or non-mutagenic. For more information, click here. |
The scripts included in this directory are useful to control some AMMP jobs automatically.
Automatic Boltzmann jump.c |
This script performs a conformational analysis of the molecule in the current workspace by the Boltzmann jump algorithm. In detail, it generates 1000 conformations at a temperature of 500 K and minimizes each of them by the conjugate gradients algorithm (3000 steps, 0.01 RMS). The conformations are automatically saved to a DCD trajectory. After the conformational search, a cluster analysis is performed to discard the redundant conformations and to keep only the most significant conformers (one for each cluster). In particular, the conformations whose average flexible torsion angle values differ by no more than 60 degrees are included in the same cluster. Two files are automatically generated: a trajectory file (* - clust.dcd) with the best conformer of each cluster and a text file (* - clust.ene) with the energy of the best conformer and the number of conformers of each cluster. All calculation parameters can be changed by editing the values at the beginning of the script source code. For more information on the conformational search, click here. |
Dipole.c |
It calculates the dipole moment by AMMP. If the charges aren't assigned, they are assigned by the Gasteiger-Marsili method (see AMMP's DIPOLE command). |
Interaction analysis.c |
It evaluates the non-bond interaction energy between two molecules. This calculation requires two molecules in the workspace: the first one must be the receptor and the second one must be the ligand. For more information, see the ANALYZE command in the AMMP manual. The results are shown in the VEGA ZZ console. AMMP shows the energy for each atom in the selection range:
Vnonbon internal lys.n 137 Eq -12.860423 E6 -1.397398 E12 2.191076
Vnonbon external lys.n 137 Eq 16.632879 E6 -5.806829 E12 9.177713
Vnonbon total lys.n 137 Eq 3.772456 E6 -7.204227 E12 11.368790
where internal is the intramolecular energy, external is the intermolecular (interaction) energy, total is the sum of the intramolecular and intermolecular energies, Eq is the electrostatic (Coulombic) energy, and E6 and E12 are the Lennard-Jones terms. At the end of the atom dump, AMMP also shows:
Vnonbon total internal 151.439880
Vnonbon total external 2.272158
Vnonbon total 153.712067
153.712067 non-bonded energy
153.712067 total potential energy
where Vnonbon total internal is the total intramolecular energy, Vnonbon total external is the total intermolecular (interaction) energy and Vnonbon total is the total non-bond interaction energy (the sum of Vnonbon total internal and Vnonbon total external). Non-bonded energy and total potential energy are self-explanatory. Finally, the results (Vnonbon total internal, Vnonbon total external and Vnonbon total) are copied to the clipboard. |
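The per-atom records can also be checked outside VEGA ZZ. The following is a minimal Python sketch, not part of the VEGA ZZ package: the function name parse_vnonbon is hypothetical, and it assumes that the console text was saved with the record layout of the sample shown above. It extracts the three energy terms and verifies that each total equals internal plus external.

```python
import re

# Assumed layout of one per-atom record, taken from the sample output:
#   Vnonbon <kind> <atom> <number> Eq <value> E6 <value> E12 <value>
RECORD = re.compile(
    r"Vnonbon\s+(internal|external|total)\s+(\S+)\s+(\d+)\s+"
    r"Eq\s+(-?\d+\.\d+)\s+E6\s+(-?\d+\.\d+)\s+E12\s+(-?\d+\.\d+)"
)


def parse_vnonbon(text):
    """Return {(atom, number): {kind: (Eq, E6, E12)}} for every record found."""
    atoms = {}
    for kind, atom, number, eq, e6, e12 in RECORD.findall(text):
        terms = (float(eq), float(e6), float(e12))
        atoms.setdefault((atom, int(number)), {})[kind] = terms
    return atoms


sample = """
Vnonbon internal lys.n 137 Eq -12.860423 E6 -1.397398 E12 2.191076
Vnonbon external lys.n 137 Eq 16.632879 E6 -5.806829 E12 9.177713
Vnonbon total lys.n 137 Eq 3.772456 E6 -7.204227 E12 11.368790
"""

energies = parse_vnonbon(sample)[("lys.n", 137)]
for i, name in enumerate(("Eq", "E6", "E12")):
    expected = energies["internal"][i] + energies["external"][i]
    # The values are printed with six decimals, so allow for rounding.
    assert abs(expected - energies["total"][i]) < 5e-6
    print(name, "interaction term:", energies["external"][i])
```

The summary lines at the end of the dump (for example Vnonbon total internal 151.439880) have a different layout and are deliberately ignored by this pattern.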
Neural network.c |
AMMP's Kohonen neural network is used to find the 3D space-filling curve corresponding to the structure. If the charges aren't assigned, they are assigned by the Gasteiger-Marsili method (see AMMP's KOHONEN command). |
Rigid docking.c |
It performs a rigid docking calculation by the genetic algorithm implemented in the AMMP program. This calculation requires two molecules in the workspace: the first one must be the receptor and the second one must be the ligand. The ligand is moved to obtain the complex. Both molecules must include the hydrogens; the charges are automatically assigned (Gasteiger-Marsili method) if they are missing. |
These scripts make it possible to build complex structures:
Aromaticity fix.c | It fixes the bond order in aromatic rings, changing the alternated single and double bonds to partial double bonds. |
Coordinate transformation.c | This script applies the specified transformation matrix to all atoms or to visible/active atoms only (see the Active atoms only checkbox). It's useful to build multimeric structures from the information included in the REMARK 300 and 350 tags of PDB files.
REMARK 300
REMARK 300 BIOMOLECULE: 1
REMARK 300 THIS ENTRY CONTAINS THE CRYSTALLOGRAPHIC ASYMMETRIC UNIT
REMARK 300 WHICH CONSISTS OF 2 CHAIN(S). SEE REMARK 350 FOR
REMARK 300 INFORMATION ON GENERATING THE BIOLOGICAL MOLECULE(S).
REMARK 350
REMARK 350 GENERATING THE BIOMOLECULE
REMARK 350 COORDINATES FOR A COMPLETE MULTIMER REPRESENTING THE KNOWN
REMARK 350 BIOLOGICALLY SIGNIFICANT OLIGOMERIZATION STATE OF THE
REMARK 350 MOLECULE CAN BE GENERATED BY APPLYING BIOMT TRANSFORMATIONS
REMARK 350 GIVEN BELOW. BOTH NON-CRYSTALLOGRAPHIC AND
REMARK 350 CRYSTALLOGRAPHIC OPERATIONS ARE GIVEN.
REMARK 350
REMARK 350 BIOMOLECULE: 1
REMARK 350 APPLY THE FOLLOWING TO CHAINS: B, A
REMARK 350 BIOMT1 1 1.000000 0.000000 0.000000 0.00000
REMARK 350 BIOMT2 1 0.000000 1.000000 0.000000 0.00000
REMARK 350 BIOMT3 1 0.000000 0.000000 1.000000 0.00000
REMARK 350 BIOMT1 2 -1.000000 0.000000 0.000000 174.00000
REMARK 350 BIOMT2 2 0.000000 -1.000000 0.000000 174.00000
REMARK 350 BIOMT3 2 0.000000 0.000000 1.000000 0.00000
To build this homodimeric macromolecule:
|
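To make the meaning of the BIOMT records concrete, here is a minimal Python sketch, independent of the VEGA ZZ script. The function names parse_biomt and apply_biomt and the test coordinates are hypothetical. It assumes that each operator is given as three consecutive rows (BIOMT1, BIOMT2, BIOMT3), each holding three rotation elements followed by one translation element, as in the REMARK 350 example above.

```python
def parse_biomt(remark_lines):
    """Collect BIOMT rows into {operator number: [row1, row2, row3]}.

    Each row is [r1, r2, r3, t]. The rows are assumed to appear in the
    order BIOMT1, BIOMT2, BIOMT3 for every operator.
    """
    operators = {}
    for line in remark_lines:
        fields = line.split()
        # Expected fields: REMARK 350 BIOMTn <operator> r1 r2 r3 t
        if (len(fields) == 8 and fields[:2] == ["REMARK", "350"]
                and fields[2].startswith("BIOMT")):
            row = [float(value) for value in fields[4:8]]
            operators.setdefault(int(fields[3]), []).append(row)
    return operators


def apply_biomt(rows, xyz):
    """Return rotation * xyz + translation for one atom position."""
    return tuple(
        r1 * xyz[0] + r2 * xyz[1] + r3 * xyz[2] + t
        for r1, r2, r3, t in rows
    )


remarks = [
    "REMARK 350 BIOMT1 1 1.000000 0.000000 0.000000 0.00000",
    "REMARK 350 BIOMT2 1 0.000000 1.000000 0.000000 0.00000",
    "REMARK 350 BIOMT3 1 0.000000 0.000000 1.000000 0.00000",
    "REMARK 350 BIOMT1 2 -1.000000 0.000000 0.000000 174.00000",
    "REMARK 350 BIOMT2 2 0.000000 -1.000000 0.000000 174.00000",
    "REMARK 350 BIOMT3 2 0.000000 0.000000 1.000000 0.00000",
]

operators = parse_biomt(remarks)
atom = (10.0, 20.0, 30.0)  # hypothetical atom position
print(apply_biomt(operators[1], atom))  # operator 1 is the identity
print(apply_biomt(operators[2], atom))  # (164.0, 154.0, 30.0)
```

Operator 1 leaves the coordinates unchanged, so the second copy of the dimer is obtained by applying operator 2 to every atom of the listed chains.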
Graphite.r | This script builds one or more graphite planes. |
Nanotube.r |
This script builds single-walled carbon nanotube (SWCNT) structures. It's based on VBS code developed by Roberto G. A. Veiga at Instituto de Física - Universidade Federal de Uberlândia (UFU) - Brazil, using the algorithm described in the article: White et al., Phys. Rev. B, 1993, Vol. 47, No. 9, pp. 5485-88. |
Peptide library.c | This script builds a peptide library of a given length (Peptide length) starting from a set of residues (Residues to use, usually the 20 natural amino acids). Optionally, you can indicate one or more base peptides, to which the residues are added at the C-terminal side. The base peptides (Base peptides field) must be separated by spaces, commas, semicolons or tabs. The peptides are built as beta-sheets in zwitterionic form, with the side chains ionized according to the physiological pH. They are stored in a database (see Output box, Database) in any format supported by VEGA ZZ. |
Protein mutagenesis.c |
This script generates mutated proteins from a template structure and a list of mutations. As a first step, it asks whether you want to generate all possible combinations of the mutations or only one mutation for each column of the mutation file. The output molecules are stored in a database and an additional CSV file is also generated, containing the molecule names and their amino acid sequences. The mutation list file must include one mutation per line in the following format:
ResName:ResNum:ChainID:MolNum List_Of_Aminoacids
where ResName is the name of the residue (max. 4 characters), ResNum is the residue number (max. 4 characters), ChainID is the chain identifier (1 character), MolNum is the molecule number (non-zero, unsigned integer) and List_Of_Aminoacids is the list of the amino acids that will be sequentially substituted (max. 20 characters, single-character amino acid codes). ChainID and MolNum are optional parameters, but if you want to specify the molecule number without indicating the chain, you can use * as ID. A # or ; at the beginning of a line marks it as a remark.
Example:
; Mutation list example
THR:3 EYF
SER:6:Y AL
It generates 6 mutants, involving residues 3 and 6: EA, YA, FA, EL, YL and FL.
Each mutated protein is automatically minimized by NAMD 2 (5000 steps), keeping the backbone fixed.
WARNING: |
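The enumeration of all combinations in the mutation list example can be sketched in a few lines of Python. This is only an illustration and not the script's own code: the function names are hypothetical, and the ordering (first listed position varying fastest) is an assumption chosen to reproduce the order given in the example.

```python
from itertools import product


def parse_mutation_list(text):
    """Return [(residue_spec, amino_acids)], skipping blank lines and remarks."""
    mutations = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # remark or empty line
        residue_spec, amino_acids = line.split()
        mutations.append((residue_spec, amino_acids))
    return mutations


def all_combinations(mutations):
    """One letter per mutated position, the first position varying fastest."""
    columns = [amino_acids for _, amino_acids in mutations]
    return ["".join(reversed(combo)) for combo in product(*reversed(columns))]


example = """; Mutation list example
THR:3 EYF
SER:6:Y AL
"""

mutants = all_combinations(parse_mutation_list(example))
print(len(mutants), mutants)
# 6 ['EA', 'YA', 'FA', 'EL', 'YL', 'FL']
```

The number of mutants is the product of the lengths of the amino acid lists (3 × 2 = 6 here), so long lists on many positions grow quickly.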
Protonation fix.c | This script fixes the protonation state of the molecule in the current workspace, removing the acidic hydrogens (bonded to carboxylate, sulfonate, phosphite and phosphate groups) and adding the basic hydrogens (to the nitrogens of primary amines and guanidines). |
Solvent cluster racemizer.c | This script creates a racemic mixture from a solvent cluster of chiral molecules built from a single enantiomer. The solvent cluster must be opened in the current workspace. |
Stereoisomers.c |
This script builds all possible stereoisomers of a chiral molecule, which must be opened in the current workspace. Diastereoisomers are automatically minimized (conjugate gradients, 3000 steps, toler 0.01). For safety reasons, the maximum number of chiral centers is limited to 8 (2⁸ = 256 stereoisomers), but it can be increased to 32 by changing the VGP_MAX_CHIRAL_CENTERS and VGP_MAX_CHIRAL_CENTERSSTR definitions. When you start the script, a file requester is shown in which you can choose the output format and the file name that is used as prefix, because each stereoisomer is named by adding the configuration of all stereocenters. Remember that if the bond order of the starting molecule is assigned incorrectly, the chirality attribution (according to the Cahn-Ingold-Prelog rules) could be wrong. |
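The growth of the number of stereoisomers with the number of chiral centers can be illustrated with a short Python sketch. It is not the script's implementation: the function name stereo_labels and the prefix_configuration naming pattern are assumptions made for the example, and 2ⁿ is only an upper bound when the molecule has symmetry (e.g. meso forms).

```python
from itertools import product

MAX_CHIRAL_CENTERS = 8  # default limit quoted in the text


def stereo_labels(n_centers, prefix):
    """Return one hypothetical output name per R/S configuration."""
    if not 1 <= n_centers <= MAX_CHIRAL_CENTERS:
        raise ValueError("number of chiral centers outside the allowed range")
    return [
        prefix + "_" + "".join(configuration)
        for configuration in product("RS", repeat=n_centers)
    ]


print(stereo_labels(2, "molecule"))
# ['molecule_RR', 'molecule_RS', 'molecule_SR', 'molecule_SS']
print(len(stereo_labels(8, "molecule")))
# 256
```

Raising the limit to 32 centers would allow up to 2³² (about 4.3 billion) configurations, which explains why the default is kept low.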
Zero coord.c |
It moves the atoms to the specified coordinates. If Active atoms only is checked, only the visible atoms are moved. |
This directory includes scripts for generic calculations:
APBS membrane energy.c |
This script evaluates the energy required by a molecule to leave the hydration shell and to reach a biological membrane. This calculation is performed by APBS and both solvents are implicitly defined by their dielectric constants (78.0 for water and 9.0 for membrane). This script uses APBS for Windows, which is included in the VEGA ZZ package. APBS is software for modeling biomolecular solvation through solution of the Poisson-Boltzmann equation (PBE), developed by Nathan Baker in collaboration with J. Andrew McCammon and Michael Holst. For more information about APBS, visit http://www.poissonboltzmann.org/apbs/ |
APBS solvation energy.c |
It calculates the solvation energy of the molecule in the current workspace by APBS. The results are shown in the VEGA ZZ console and copied to the clipboard. This script uses APBS for Windows, which is included in the VEGA ZZ package. APBS is software for modeling biomolecular solvation through solution of the Poisson-Boltzmann equation (PBE), developed by Nathan Baker in collaboration with J. Andrew McCammon and Michael Holst. For more information about APBS, visit http://www.poissonboltzmann.org/apbs/ |
Copy properties.c | This script selectively copies some molecular properties to the clipboard. |
Database properties.c |
This script calculates several properties for all molecules included in a database as 3D structures. The script asks for a database in one of the formats supported by VEGA ZZ as input and for a CSV file as output. During the calculation, a log file is also created in which all errors are recorded. This script is especially useful for those database formats that don't include molecular properties, such as Mol2, SDF and Zip. |
Druglikeness.c |
By this script, you can check the druglikeness of the molecule in the current workspace. Two
methods are used:
Lipinski's rule of five
Ghose's rule
The molecular refractivity is calculated according to the Ghose and Crippen method.
References: Ghose, A. K.; Viswanadhan V. N.; Wendoloski, J.J. |
Electrostatic energy.c |
It evaluates the electrostatic energy of the molecule in the current workspace. The default dielectric constant is 1 (vacuum). |
Log kw IAM.MG/DD2.c |
Since the scale of log kw IAM values was frequently found to mimic the drug/membrane interactions actually occurring in vivo better than lipophilicity in n-octanol, this script implements a method to predict log kw IAM for both the MG and DD2 chromatographic columns. In particular, you can estimate the retention time as log kw values for the molecule in the current workspace or, alternatively, for any molecule in the PubChem database. The results can be copied to the clipboard and, if the descriptors used for the prediction or the calculated log kw are outside the prediction domain, warning messages are shown in the console. You can choose between two prediction methods that use different approaches to predict log P: the former, more accurate, is based on miLogP and requires sending the data to the Molinspiration software; the latter, less accurate, is based on virtual log P and runs off-line. If you have to manage sensitive data that you don't want to share on the Web, you should choose the second method.
The virtual log P-based predictions use the following correlative equations:
log kw IAM.MG = -0.3867 + 0.4159 VirtualLogP + 0.0741 HeavyAtoms - 0.0806 HLBG - 0.0657 FlexDihedrals
n = 205; r² = 0.75; q² = 0.74; SE = 0.501; F = 151.79; P < 1.0 × 10⁻⁸; PC = 51.739
log kw IAM.DD2 = -3.0812 + 0.4809 VirtualLogP + 0.5464 Vdiam - 0.0765 HLBPSA - 0.0829 FlexDihedrals
n = 161; r² = 0.80; q² = 0.79; SE = 0.523; F = 155.22; P < 1.0 × 10⁻⁸; PC = 44.319
where:
WARNING: |
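The two virtual log P-based equations given above translate directly into code. The Python sketch below only restates them; the function and parameter names are hypothetical, the descriptor values in the usage example are invented, and no check of the prediction domain is performed.

```python
def log_kw_iam_mg(virtual_logp, heavy_atoms, hlbg, flex_dihedrals):
    """log kw for the IAM.MG column, coefficients copied from the text."""
    return (-0.3867
            + 0.4159 * virtual_logp
            + 0.0741 * heavy_atoms
            - 0.0806 * hlbg
            - 0.0657 * flex_dihedrals)


def log_kw_iam_dd2(virtual_logp, vdiam, hlbpsa, flex_dihedrals):
    """log kw for the IAM.DD2 column, coefficients copied from the text."""
    return (-3.0812
            + 0.4809 * virtual_logp
            + 0.5464 * vdiam
            - 0.0765 * hlbpsa
            - 0.0829 * flex_dihedrals)


# Invented descriptor values, for illustration only.
print(round(log_kw_iam_mg(virtual_logp=2.5, heavy_atoms=17,
                          hlbg=8.0, flex_dihedrals=4), 3))
print(round(log_kw_iam_dd2(virtual_logp=2.5, vdiam=7.0,
                           hlbpsa=9.0, flex_dihedrals=4), 3))
```

In both models a higher virtual log P increases the predicted retention, while a higher HLB value and more flexible dihedrals decrease it.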
Mopac.r | It runs multiple Mopac jobs. |
XLOGP2.c | It calculates the logP by the XLOGP V2 method. The result is shown in the VEGA ZZ console and copied to the clipboard. This script requires X-Score 1.3 for Windows, which is not included in the VEGA ZZ package. To install X-Score, read the X-Score script manual.
For more information about X-Score and XLOGP, visit http://www.sioc-ccbg.ac.cn/ |
Scripts to color the molecule:
Color RasMol.c | It colors the molecule in the current workspace according to the RasMol color scheme. |
Color VMD.c |
It colors the molecule in the current workspace according to the VMD color scheme. |
This directory contains the initialization scripts to include in REBOL scripts:
Fmod.r | Fmod commands. |
Formats.r | File format keywords and other definitions. |
Utils.r | Functions for path manipulation. |
Vega.r | VEGA ZZ interface (don't change it without a real reason!). |
Vutils.r | REBOL/View utilities. |
The C header files contained in this directory are hidden and can't be changed directly from the VEGA ZZ environment.
This directory includes communication and Internet-related scripts:
Download molecule from URL.c | This script downloads a molecule from a given URL. |
E-mail PDB send.c | This script saves the current molecule in PDB format, compresses it and attaches it to a user-editable e-mail. This script uses the MAPI layer, so it's compatible with MAPI-compliant e-mail clients only (e.g. Outlook, Outlook Express, etc). To change the output format or other settings, see the script source code. |
This directory includes scripts to manage databases:
Count functional groups.c | This script counts the functional groups of each molecule in a database. The functional groups are recognized by the Kier-Hall SMARTS template, but you can also use ATDL templates such as GROUPS, GROUPS_EXT, TRIPOS, etc. You can change the template by editing the VGS_DEF_TEMPLATE constant in the code. If the input database supports SQL, you can choose to save the data in the same database or in a separate CSV file. These data are useful for QSAR analyses in which you need to recognize and count the functional groups. |
Database expander.r | It's a REBOL/View script that extracts the molecules contained in a database to a directory. It lets you specify the file format, the compression and the save attributes (connectivity and constraints). |
Database logP.c | It calculates the logP by Testa's MLP method for each molecule in the database and exports the results to a CSV file (Output file). The input must be a supported database (Input database) and its structures can be pre-processed by adding the hydrogens (Add the hydrogens) with the geometry method (default) or the bond order method (Use bond order). The latter is recommended if the molecules have an assigned bond order. In the pre-processing phase, the structures can be optimized by the steepest descent (Steepest minimization) and/or the conjugate gradients (Conjugate minimization) methods. For both minimization algorithms, it's possible to set the number of iterations (Steps), the toler value (Toler) and the dielectric constant (Dielectric). If Update the graphic is checked, the 3D graphic output is updated every 20 minimization steps. Increasing the Dot density value improves the logP prediction. A good value is from 10 to 50 dots per Å².
Warning: even if in theory it's possible to manage a 2D database by adding the hydrogens with the bond order method and optimizing the structures, this procedure is not recommended because the distance geometry optimization is not performed. For this reason, a better choice is to convert the database from 2D to 3D (see the Database 2D to 3D.c script) and to use the resulting database directly to predict the logP values. |
Database volume.c | It calculates the volume of each molecule in the database. It has the same options as the Database logP.c script. |
Database to 0D.c | This script converts a 2D or 3D database to a 0D SDF database, translating all atoms to the specified coordinates, usually (0, 0, 0). |
DrugBank SDF fix.c | The DrugBank SDF files aren't standard, because the header of each record has only two lines instead of three and the first line contains a tab character separating the molecule name from the DrugBank ID. This script creates a new file, adding the _fix.sdf suffix to the file name, and fixes the records by adding the missing line and by removing the tab character and the "SDF file of " string from the molecule name line. |
Force field check.c | This script assigns the force field to each molecule in the database and checks whether it is correctly assigned. An output file is created in the same directory as the database file and is named after the database, followed by the - force field check.txt suffix. This script is useful to check for problems in the atom type assignment before running a virtual screening calculation. |
Mol2 merge.c | It joins two or more databases in Mol2 format into a new file. This script doesn't perform any change to the data and therefore it's extremely fast. |
SDF merge.c | It joins two or more databases in SDF format into a new file. This script doesn't perform any change to the data and therefore it's extremely fast. |
SDF metadata extractor.c | This script extracts the metadata (e.g. InChI, SMILES, biological activity, etc) from an SDF file and puts it into a Comma Separated Values (CSV) file. The output file is placed in the same directory as the source database and its name is generated from it by adding the _meta.csv suffix. |
SMILES to database.c | This script converts the SMILES molecules of a CSV file to 3D and puts them in a database. The CSV file must have two fields for each line separated by a semicolon (;): the former must be the molecule name and the latter must be the SMILES string. |
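The input layout expected by SMILES to database.c (molecule name, semicolon, SMILES string) is easy to validate before running the conversion. The following Python sketch is an independent illustration, not part of VEGA ZZ; the function name read_smiles_csv is hypothetical and it only checks the two-field layout, not the chemical validity of the SMILES strings.

```python
import csv
import io


def read_smiles_csv(handle):
    """Return [(name, smiles)] from a semicolon-separated file-like object."""
    molecules = []
    reader = csv.reader(handle, delimiter=";")
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip empty lines
        if len(row) != 2:
            raise ValueError(
                "line %d: expected 2 fields, found %d" % (line_number, len(row))
            )
        name, smiles = (field.strip() for field in row)
        molecules.append((name, smiles))
    return molecules


sample = io.StringIO("ethanol;CCO\nbenzene;c1ccccc1\n")
print(read_smiles_csv(sample))
# [('ethanol', 'CCO'), ('benzene', 'c1ccccc1')]
```

With a real file, pass an object opened with open(path, newline="") instead of the in-memory sample.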
Splitter.c | This script splits a database into several files, which can be useful to distribute calculations over different PCs. The Input database must be in one of the formats supported by VEGA ZZ. |
Subset creator.c | It creates a new database in SQLite format, including a subset of the molecules of another database. The molecules must be specified in a text file in which the molecule names (not IDs) are placed one per line. The subset database is created in the same directory as the source, keeping its name as prefix and adding the _subset.db suffix. A log file is also generated, in which possible problems are reported. This script was specifically developed to prepare input databases for virtual screening studies. |
ZINC get by ID.c | This script downloads a structure from ZINC database to the current workspace by specifying the molecule ID. If the code is wrong or the entry doesn't exist, an error message is shown. |
Scripts for development.
Decision tree to C converter.c | This program converts machine learning models, in particular the classification trees generated by the Weka program, to C source code that requires no or very limited modifications to be used. It is the conversion of the Tree2C command line program to a VEGA VLL extension.
Weka model preparation - Mini how to
This part of the manual is not meant to be exhaustive; more information can be found in the Weka manual and tutorials.
Decision tree conversion to C
When you choose VEGA ZZ C-script, the attribute names are analyzed: if they can be calculated by VEGA ZZ, the right code is automatically added to the output; otherwise, a warning message is shown because the resulting code will be incomplete and will require further implementation by the user. |
Scripts for the manipulation of the nucleic acids (DNA, RNA and PNA).
DNA to PNA.c | This tool converts the DNA to PNA, acting only on the selected atoms. |
RNA to DNA.c | This tool converts RNA to single-stranded DNA, acting only on the selected atoms. |
Scripts for molecular docking.
These scripts prepare input files for AutoDock 4:
Box calc.c | It calculates the dimensions and the center coordinates of the box containing the active (visible) atoms and shows the results in the console. This script is useful to define the region of a macromolecule in which ligands are docked. |
DLG to PDB multimodel.c | It converts an AutoDock 4 DLG output to a standard PDB multimodel file, keeping the energy information in the remarks. This conversion is not required by VEGA ZZ, which reads DLG files as trajectories, but is needed by programs that are unable to manage this kind of file. |
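The box calculation amounts to an axis-aligned bounding box around the selected atoms. The Python sketch below illustrates the idea under stated assumptions: the function name bounding_box is hypothetical, no safety margin is added around the atoms, and whether the VEGA ZZ script adds one is not stated here.

```python
def bounding_box(coordinates):
    """Return (size, center) of the axis-aligned box enclosing the points."""
    if not coordinates:
        raise ValueError("at least one atom is required")
    xs, ys, zs = zip(*coordinates)
    lower = (min(xs), min(ys), min(zs))
    upper = (max(xs), max(ys), max(zs))
    size = tuple(hi - lo for lo, hi in zip(lower, upper))
    center = tuple((lo + hi) / 2 for lo, hi in zip(lower, upper))
    return size, center


# Hypothetical coordinates of three visible atoms.
atoms = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0), (1.0, 1.0, 1.0)]
size, center = bounding_box(atoms)
print("size:", size)      # (2.0, 4.0, 6.0)
print("center:", center)  # (1.0, 2.0, 3.0)
```

For docking, the box is usually enlarged by a few Å in each direction so that the ligand can move freely near the boundary atoms.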
Ki calculator.c | It evaluates the Ki and the interaction energy of a given ligand-receptor complex. This script is useful to recalculate the AutoDock 4 score after an energy minimization (e.g. performed by NAMD). This calculation requires at least two molecules in the workspace and atom constraints defining the region in which the AutoDock 4 grid maps will be calculated. Only the free atoms are considered to define this region. If there are more than two molecules or the ligand is ambiguous, the script asks you to specify the molecule ID of the ligand. The results are shown in the VEGA ZZ console and copied to the clipboard. |
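The text does not state how the Ki is derived from the energy. A common convention, used by AutoDock for its estimated inhibition constant, is Ki = exp(ΔG / RT). The Python sketch below applies this relation, assuming that the script follows the same convention; the function name, the temperature of 298.15 K and the example energy are assumptions made for illustration.

```python
import math

GAS_CONSTANT = 1.9872e-3  # kcal/(mol K)


def ki_from_binding_energy(delta_g, temperature=298.15):
    """Return Ki in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


# Hypothetical binding free energy of -10 kcal/mol.
ki = ki_from_binding_energy(-10.0)
print("Ki = %.3e M (%.1f nM)" % (ki, ki * 1e9))
```

Because of the exponential dependence, a change of about 1.4 kcal/mol in the energy corresponds to one order of magnitude in Ki at room temperature.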
Ligand.c | By this script, you can prepare the current molecule
to be used as ligand with AutoDock 4, performing these steps:
If the molecule has two dimensions only, the 2D to 3D conversion is performed as explained below:
These steps are performed for both 2D and 3D structures:
|
Receptor.r | By this script, you can prepare the
current molecule
as receptor for AutoDock 4, performing these steps:
The pre-defined docking box is set to explore the entire receptor, but if you want to explore a specific protein region, you must select the atoms defining that region before running the script. |
These scripts are useful to manage the PLANTS docking software.
Docking.c | This script performs a molecular docking or a virtual screening calculation by the PLANTS software, which must be installed as explained in the manual and in the PLANTS node of the script tree. The receptor and the ligand must be in Sybyl Mol2 format and, if you want to run a virtual screening, the ligands must be included in a Mol2 database (Mol2 multimodel format). In the graphic interface, some parameters can be set:
By clicking the Run button, the calculation starts and a window is shown in which it's possible to stop the run by clicking the Abort button.
WARNING: if you close VEGA ZZ, the PLANTS calculation is not stopped, but when it finishes, the script doesn't convert the output files to a format that Microsoft Excel can read directly. For more information about PLANTS, visit http://www.tcd.uni-konstanz.de/
PLANTS installation:
If you installed the 1.1 version built by Mingw32, it's strongly recommended to patch it by running Patch bin 1.1 script. |
Patch bin 1.1.c | This script applies a patch to the PLANTS 1.1 binary (Mingw32 version) in order to fix the S.O and S.O2 atom types, which are wrongly defined as S.o and S.o2. A backup copy of the original version of PLANTS is made in the ...\VEGA ZZ\Bin\Win32 directory (Plants.bak).
WARNING: To run this script, you need administrative rights, otherwise it will be impossible to patch PLANTS. If User Account Control (UAC) is enabled, you must run VEGA ZZ as administrator. To do so, right-click the VEGA ZZ icon on the desktop and select Run as administrator. |
Receptor.c | This script saves the receptor in the current workspace to be used in PLANTS calculations. In particular, it marks the backbone atoms and bonds with the BACKBONE label, which is required to consider the flexibility of the receptor side chains during the docking. WARNING: If you don't need to consider the receptor flexibility, you can save a normal Sybyl Mol2 file from the VEGA ZZ main menu. |
Rescore ChemPlp.c
Rescore Plp.c
Rescore Plp95.c |
It evaluates the ligand - receptor
interaction energy by ChemPlp, Plp and Plp95 scoring
functions implemented in PLANTS. This calculation requires at least two
molecules in the workspace and if there are more than two molecules or
the ligand is ambiguous, the script asks to specify the molecule ID of
the ligand. The results are shown in VEGA ZZ console and copied to the clipboard. This script requires PLANTS for Windows that is not included in VEGA ZZ package. |
RMSD calc.c | This script calculates the root mean square deviation (RMSD) of a given set of poses obtained by a docking calculation. The first pose of each ligand, which in the case of PLANTS is the best ranked, is taken as the reference structure. The script also calculates the RMSD after aligning each pose to the reference one (ALNRMSD), which is useful to evaluate the conformational changes between the poses, and the mean values of both types of RMSD. You must specify only the database including the docking poses (without the target/protein) and the name of the output CSV file. You can also give as input a database including both receptors and ligands. The script tries to detect the ligand automatically and, when this is not possible, a requester is shown. |
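The non-aligned RMSD between a pose and the reference can be sketched in a few lines of Python. This is an illustration only, with hypothetical function names and coordinates; it assumes that the atoms of every pose are listed in the same order, and it does not perform the superposition needed for the ALNRMSD value.

```python
import math


def rmsd(reference, pose):
    """Root mean square deviation between two equally ordered atom lists."""
    if len(reference) != len(pose) or not reference:
        raise ValueError("the poses must have the same non-zero atom count")
    squared = sum(
        (a - b) ** 2
        for atom_ref, atom_pose in zip(reference, pose)
        for a, b in zip(atom_ref, atom_pose)
    )
    return math.sqrt(squared / len(reference))


def rmsd_to_first_pose(poses):
    """RMSD of every pose against the first one, plus the mean value."""
    values = [rmsd(poses[0], pose) for pose in poses[1:]]
    mean = sum(values) / len(values) if values else 0.0
    return values, mean


# Three hypothetical two-atom poses of the same ligand.
poses = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)],
]
print(rmsd_to_first_pose(poses))
```

The third pose is the first one translated by (3, 4, 0), so its non-aligned RMSD is 5.0, whereas an aligned RMSD would be zero: this is the difference that the two columns of the script output capture.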
These scripts prepare the input files for AutoDock Vina and run it:
Docking.c | This script performs a molecular
docking calculation using AutoDock Vina. The receptor and the ligand
files must be in PDBQT format and can be prepared by Receptor.c
and Ligand.c scripts. In the graphic interface of this script,
you can specify the following parameters:
For more information about AutoDock Vina, click here. |
Ligand.c | It prepares and saves the current molecule
as ligand for Vina, performing these steps:
If the molecule has two dimensions only, the 2D to 3D conversion is performed as explained below:
These steps are performed for both 2D and 3D structures:
|
Receptor.c | It prepares and saves the molecule
in the current workspace as receptor for Vina, performing these steps:
|
Virtual screening.c | This script performs structure-based
virtual screenings by AutoDock Vina. To do them, you need:
The graphical user interface of this script makes it easy to set up the screening by changing the following parameters:
About the restart |
If you want to run a Vina docking calculation, follow these steps:
13.3.13.4 Other docking scripts
Here are other scripts for generic analysis.
APBS binding energy.c | This script evaluates the binding
energy of a given ligand-receptor complex. This calculation requires
at least two molecules in the workspace; if there are more than two
molecules or the ligand is ambiguous, the script asks you to specify the
molecule ID of the ligand. The results are shown in the VEGA ZZ console and
copied to the clipboard. This script uses APBS for Windows, which is included in the VEGA ZZ package. APBS is a software package for modeling biomolecular solvation by solving the Poisson-Boltzmann equation (PBE), developed by Nathan Baker in collaboration with J. Andrew McCammon and Michael Holst. For more information about APBS, visit http://www.poissonboltzmann.org/apbs/ |
Best score of isomers.c | This script was developed to manage
the docking results obtained when the database of ligands was
expanded with stereoisomers, geometric isomers and tautomers of each
molecule. It chooses the best isomer of a molecule on the basis of the
best (lowest) docking score. When you run the script, you must specify the
input file in CSV format including the data (molecule name, scores, etc.)
of all docked species, the output CSV file, the column with the ligand
names and the column of the score. The isomers are detected by name: they must share the same prefix followed by the underscore character ("_"). |
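The selection rule can be sketched in a few lines of standalone Python. This is an illustration, not the script's code: `best_isomers` is a hypothetical name, the rows are assumed to be already read from the CSV file as (name, score) pairs, and the prefix is assumed to be everything before the last underscore.

```python
def best_isomers(rows):
    """Keep, for each name prefix, the isomer with the lowest docking score.

    rows: iterable of (molecule_name, score) pairs.
    Assumption: the prefix is the part of the name before the LAST underscore;
    a name without underscores is its own prefix.
    """
    best = {}
    for name, score in rows:
        prefix = name.rpartition("_")[0] or name
        if prefix not in best or score < best[prefix][1]:
            best[prefix] = (name, score)
    return best
```

The returned dictionary maps each prefix to the (name, score) pair of its best isomer, which can then be written to the output CSV file.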
Contact surface.c | This script measures the ligand/receptor contact surface (shared surface) in a complex. The results are automatically copied to the clipboard: the contact surface and its percentage relative to the ligand, receptor and complex surfaces, respectively. Surface areas are expressed in Ų. |
Fred2 scrore.c |
It calculates the interaction score of a ligand-protein complex using OpenEye's Fred2 docking software. This calculation requires two molecules
in the workspace: the first one must be the receptor and the second one
must be the ligand. The scores extracted from Fred's output are:
Chemgauss2, Chemscore, Plp, Screenscore, Shapegauss and Zapbind. The
results are automatically copied to the clipboard.
Warning: This script requires Fred2 to be installed on your PC. You can request/buy it at http://www.eyesopen.com/ |
GOLD score extractor.c | This script extracts the docking scores of each pose stored in the mol2 file generated by GOLD and saves them to a CSV file. The output file is created in the same directory as the mol2 file and is named XXX_GOLD.csv, where XXX is the name of the source file. |
Hypervolume analyzer.c |
This script calculates the shared area (hyperarea) and the shared
volume (hypervolume) of a set of multiple poses obtained by a
docking calculation. As input, you must specify the database with the
docking poses in one of the formats supported by VEGA ZZ, while the
output CSV file is saved in the same directory as the database with the
name DATABASE_PREFIX - HyperVol.csv. In the output file, you can
find the following columns:
The multiple poses of the same molecule are detected by their names: they must share the same prefix followed by the underscore character ("_"). |
Mean score of multiple poses.c |
This script calculates the mean, minimum, maximum, range and standard deviation of the docking scores for all poses of each
ligand. When you run the script, you must specify the input file in CSV
format including the data (molecule name, scores, etc.) of all docked
species, the output CSV file, the column with the ligand names and the
column of the score. The ligands are detected by name: they must share the same prefix followed by the underscore character ("_"). |
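As an illustration of the statistics listed above, here is a standalone Python sketch. It is not taken from the script: `pose_statistics` is a hypothetical name, the prefix is assumed to be the part of the name before the last underscore, and the sample standard deviation is assumed (the manual does not say which estimator is used).

```python
import statistics
from collections import defaultdict


def pose_statistics(rows):
    """Group (name, score) pairs by ligand prefix and summarize the scores."""
    groups = defaultdict(list)
    for name, score in rows:
        prefix = name.rpartition("_")[0] or name
        groups[prefix].append(score)

    result = {}
    for ligand, scores in groups.items():
        result[ligand] = {
            "mean": statistics.mean(scores),
            "min": min(scores),
            "max": max(scores),
            "range": max(scores) - min(scores),
            # Assumption: sample standard deviation; 0.0 for a single pose.
            "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
        }
    return result
```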
Mopac binding enthalpy.c |
This script evaluates the binding enthalpy with MOPAC:
This script requires at least Mopac 2012 for Windows, which is not included in the VEGA ZZ package. For more information, see Installation of optional components. |
Rescore+.c | This script recalculates the
interaction scores between a given set of ligand poses in a database and
a target molecule or, alternatively, between a ligand and a receptor
both included in a trajectory file. To run the calculation, you must specify the Receptor
file name, the Database including the docked ligands, the CSV output
file to store the scores, the Log file in which the errors are written
and, finally, one or more scoring functions. For more information about
the scoring functions, you can consult the
VEGA ZZ manual. WARNING: the database must contain ligand poses obtained by a previous docking calculation. This script doesn't perform any kind of docking calculation. To calculate the RPScore, the ligand must be a peptide/protein with the residue names indicated in the sequence. |
RPScore.c | It calculates the RPScore of a given
protein-protein complex or of a trajectory of protein-protein complexes. In
the latter case, the results are saved to a CSV file. The complex or
the trajectory must be open in the current workspace. This script is the VEGA ZZ implementation of the well-known RPScore program. For more details: http://www.sbg.bio.ic.ac.uk/docking/rpscore.html Gidon Moont, Henry A. Gabb, and Michael J.E. Sternberg, "Use of Pair Potentials Across Protein Interfaces in Screening Predicted Docked Complexes", PROTEINS: Structure, Function, and Genetics 35:364-373 (1999). |
WarpEngine GRAMM extractor.c | This script extracts one or more
complexes from the output generated by GRAMM docking software used in
WarpEngine parallel execution environment. To complete the extraction,
the script asks you:
By default, the script saves the complexes in IFF format and assigns CHARMM force field and Gasteiger-Marsili atom charges. These default parameters can be changed by editing the script code. |
X-Score.c | It evaluates the interaction score of a
given ligand-receptor complex. This calculation requires at least two
molecules in the workspace; if there are more than two molecules or
the ligand is ambiguous, the script asks you to specify the molecule ID of
the ligand. The results are shown in the VEGA ZZ console and copied to the clipboard. This script requires X-Score 1.2 or 1.3 for Windows, which
is not included in the VEGA ZZ package.
|
This directory includes the example scripts:
Benzene | In this folder, you can find several examples showing you how to build a benzene ring using different scripting languages (C-Script, JavaScript, PHP, Python, REBOL). |
HyperDrive | This folder includes C-scripts showing you how to use HyperDrive APIs. |
Log kw IAM | In this folder, there are minimalist codes in different scripting languages to calculate log kwIAM.DD2 and log kwIAM.MG of the molecule in the current workspace. |
Command console.htm | This script demonstrates how to control VEGA ZZ with JavaScript in an HTML page. |
Demo.bat | Demo script. |
Demo.r | Same as above, but written in REBOL. |
Distances.r | This REBOL script explains how to measure interatomic distances. |
Graph.r | Demo of the extended commands to manage the plots. |
GraphApp demo.r | Demo of the GraphApp GUI library. |
Info.r | It shows some information in the VEGA ZZ console. |
Meshload.r | It loads and shows a 3D rabbit mesh model. |
Mini-XML demo.c | Demo script of Mini-XML library. |
MP3 player.r | Minimalist mp3 player (fmod demo). |
NAMD minimization.c | This script shows how to use the NAMD helper to perform an energy minimization by NAMD 2. It requires only a molecule in the current workspace. |
REBOL View\VEGA ZZ toolbar.r | It shows a REBOL/View toolbar to control the VEGA ZZ main features. |
Requesters.r | Simple demo of the VEGA ZZ built-in requesters. |
VEGA GL.c | Application example of VEGA GL commands. |
This directory includes scripts for file format conversion:
CSSR SOMFA export.c | This script exports the current molecule in CSSR format readable by SOMFA. |
CSV export.c | It saves the current molecule in Comma Separated Values (CSV) format. |
Format conversion.r |
This script performs the batch file format conversion of all molecules contained in a folder. Some parameters can be changed in the dialog window:
|
PDB ren export.c | It exports the molecule in PDB format renumbering the atoms. |
XYZ import.c | It imports XYZ files giving the possibility to adapt the filter to each sub-format. |
These scripts calculate and manage ligand-receptor interaction surfaces.
CHARMM interaction surface.c | It calculates the CHARMM non-bond interaction energy of each ligand-receptor atom pair and projects it onto the Van der Waals surface. You must enter the molecule ID/number to indicate the ligand. |
Lipophilic interaction surface.c | It calculates the lipophilic interaction of each ligand-receptor atom pair and projects it onto the Van der Waals surface. You must enter the molecule ID/number to indicate the ligand. |
MEP interaction surface.c | It calculates the electrostatic interaction energy of each ligand-receptor atom pair and projects it onto the Van der Waals surface. You must enter the molecule ID/number to indicate the ligand. |
MLPInS color ramp.c |
This script normalizes the color ramp calculated by MLPInS interaction
surface script, using the user-defined range of values. The
normalization is useful to compare surfaces of different molecules using
the same color scheme. It recognizes MLPInS surfaces only and changes them selectively. |
MLPInS interaction surface.c | It calculates the MLP Interaction Score (MLPInS) of each ligand-receptor atom pair and projects it onto the Van der Waals surface. You must enter the molecule ID/number to indicate the ligand. |
Scripts to create movies.
Movie maker.c |
This script generates a movie file starting from the molecule in the
current workspace, rotating it around one or more axes. The parameters
that you can change are: Output movie (file name of the
output movie), Number of frames (number of frames to put in the
trajectory), Preview (when this gadget is checked, the animation is shown
in the main window without saving the output movie), X rotation (rotation in degrees around the X axis),
Y rotation (rotation in degrees around the Y axis) and Z
rotation (rotation in degrees around the Z axis). Click Animate to create the movie. The codec requester is shown to select the required compression options. Take care when choosing the Render mode, because not all graphics cards support the Hardware mode. The Software rendering is the most reliable, even if it cannot reach the Hardware quality. |
Sec. structure anim.c |
This script generates a movie file starting from the peptide in the
current workspace, changing the secondary structure. The parameters that
you can change are: Output movie (file name of the output
movie), Number of frames (number of frames to put in the
animation), Preview (when this gadget is checked, the animation is shown
in the main window without saving the output movie), Start Phi
(starting value of the Phi dihedral angle), Start Psi (starting
value of the Psi dihedral angle), Start Omega (starting value of the
Omega dihedral angle), End Phi (ending value of the Phi dihedral
angle), End Psi (ending value of the Psi dihedral angle) and End Omega
(ending value of the Omega dihedral angle). Click Animate to create the movie file. The codec requester is shown to select the required compression options. Take care when choosing the Render mode, because not all graphics cards support the Hardware mode. The Software rendering is the most reliable, even if it cannot reach the Hardware quality. For the most common Phi, Psi and Omega values, click here. |
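The manual does not say how the intermediate frames are generated from the start and end torsions. The standalone Python sketch below assumes a plain linear interpolation between the two (Phi, Psi, Omega) triplets; `interpolate_torsions` is a hypothetical name and angle wrap-around at ±180 degrees is deliberately ignored.

```python
def interpolate_torsions(start, end, frames):
    """Return one (phi, psi, omega) tuple per frame, from start to end.

    Assumption: linear interpolation, with the first frame equal to the start
    values and the last frame equal to the end values.
    """
    if frames < 2:
        raise ValueError("at least two frames are needed")
    result = []
    for i in range(frames):
        t = i / (frames - 1)  # 0.0 at the first frame, 1.0 at the last
        result.append(tuple(s + (e - s) * t for s, e in zip(start, end)))
    return result
```

Each tuple would then be applied to all backbone torsions of the peptide before rendering the corresponding frame.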
This directory includes the visualization scripts:
Aminoacid selector.r |
It shows amino acids by selection and/or by chemical/physical properties. |
Dump backbone torsions.c | It dumps the phi and psi backbone torsions of a protein. |
Fasta to text.r |
It converts a Fasta file into a text file. This is useful for loading it into Microsoft Excel. |
HIS protonantion.c |
It finds the histidine protonation state (on NE2 or on ND1) using the CHARMM potential and swaps the hydrogens (e.g. H-NE2 to H-ND1) according to the hydrogen bond energy. If the energy difference between the H-NE2 and H-ND1 tautomers is more than 2.0 kcal/mol, the hydrogen is placed on the nitrogen that gives the structure with the lower hydrogen bonding energy. The starting structure must include the hydrogens. |
Move hydrogens to end.c |
This script moves the hydrogen atoms to the end of the atom list. In this way, you can obtain files split in two parts: the first one containing the heavy atoms and the second one, placed at the end, containing the hydrogens. As an example, that's useful to write mol2 files compatible with GOLD docking system. |
Score.c |
It calculates the interaction score between a ligand and a generic target biomacromolecule. The ligand must be previously docked in the target
structure. This calculation requires two molecules in the workspace: the
first one must be the receptor and the second one must be the ligand. The script can calculate:
The results are automatically copied to the clipboard. |
13.3.18.1 Homology modelling services
This folder includes on-line services for homology modelling.
FUGUE.htm |
FUGUE is a program for recognizing distant homologues by
sequence-structure comparison. It utilizes environment-specific
substitution tables and structure-dependent gap penalties, where scores
for amino acid matching and insertions/deletions are evaluated depending
on the local environment of each amino acid residue in a known
structure. Given a query sequence (or a sequence alignment), FUGUE scans
a database of structural profiles, calculates the sequence-structure
compatibility scores and produces a list of potential homologues and
alignments. For more information, visit this Web site: http://tardis.nibio.go.jp/fugue/ |
I-TASSER.htm | The I-TASSER server is an Internet service for protein structure and function predictions. 3D models are built based on multiple-threading alignments by LOMETS and iterative TASSER assembly simulations; function insights are then derived by matching the predicted models with protein function databases. I-TASSER (as 'Zhang-Server') was ranked as the No. 1 server for protein structure prediction in the CASP7, CASP8 and CASP9 experiments. It was also ranked as the best for function prediction in CASP9. The server is in active development with the goal of providing the most accurate structure and function predictions using state-of-the-art algorithms. |
Phyre 2.htm | Protein Homology/analogY Recognition Engine. |
ROBETTA.htm |
Robetta provides both ab initio and comparative models of protein domains.
It uses the ROSETTA fragment insertion method (Simons et al. (1997) J
Mol Biol. 268:209-225). Domains without a detectable PDB homolog are
modeled with the Rosetta de novo protocol (Bonneau et al. (2002) J Mol
Biol. 322:65-78). Comparative models are built from Parent PDBs detected
by UW-PDB-BLAST or HHSEARCH and aligned by various methods which include
HHSEARCH, Compass, and Promals. Loop regions are assembled from
fragments and optimized to fit the aligned template structure (Rohl et
al. (2004) Proteins 55:656-677). The procedure is fully automated. For more information, visit this Web site: http://robetta.bakerlab.org/ |
SWISS-MODEL.htm |
SWISS-MODEL is a fully automated protein structure homology-modeling
server, accessible via the ExPASy web server, or from the program
DeepView (Swiss Pdb-Viewer). The purpose of this server is to make
Protein Modelling accessible to all biochemists and molecular biologists
worldwide. For more information about the service, visit: http://swissmodel.expasy.org/ |
PubChem-related scripts. They require an Internet connection.
13.3.19.1 PubChem database rename
Scripts to rename the molecules in a database.
By CID.c | This script renames all molecules in a database according to their CID code. A log file containing the errors is automatically created in the database directory by adding "- rename.log" as suffix to the database file name. |
By IUPAC.c | This script renames all molecules in a database according to their IUPAC name. A log file containing the errors is automatically created in the database directory by adding "- rename.log" as suffix to the database file name. |
By name.c | This script renames all molecules in a database according to their most common name in PubChem. A log file containing the errors is automatically created in the database directory by adding "- rename.log" as suffix to the database file name. |
Multiple by CID.c |
This script downloads multiple molecules to a directory by specifying their
CIDs in a CSV file (with semicolon-separated fields). The first line of this file must
contain the labels, the first column the CIDs and,
optionally, a second column the molecule names, which are used to name
the files. The molecules are downloaded in 3D SDF format and, if an error occurs, it is reported in a log file that has the same prefix as the CSV file and " - download.log" as suffix.
CSV file example with CIDs only:
CID
10075246
10110916
10111186
10114637
CSV file example with CIDs and names:
CID;Name
10075246;"Mol 1"
10110916;"Mol 2"
10111186;"Mol 3"
10114637;"Mol 4" |
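The input format described above can be read with a few lines of standalone Python. This sketch is not the script's code: the function names are hypothetical, and the URL template is an assumption based on the public PubChem PUG REST interface, since the manual does not say which PubChem service the script calls. No network access is performed here; the sketch only builds the list of downloads.

```python
import csv
import io

# Assumption: PUG REST request for a 3D SDF record; the script may use another service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/{cid}/SDF?record_type=3d"


def read_cid_list(text):
    """Parse the semicolon-separated list: label line, CID column, optional name column."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader, None)  # skip the line with the labels
    entries = []
    for row in reader:
        if not row or not row[0].strip():
            continue
        cid = row[0].strip()
        # Without a name column, the CID itself is used as the file name.
        name = row[1].strip() if len(row) > 1 and row[1].strip() else cid
        entries.append((cid, name))
    return entries


def download_plan(entries):
    """Return (url, output file name) pairs for the parsed entries."""
    return [(PUG_REST.format(cid=cid), name + ".sdf") for cid, name in entries]
```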
Multiple by name.c |
This script downloads multiple molecules to a directory by specifying their
names in a text file. This file must contain the names of the molecules to
download, one per line. The molecules are downloaded in 3D SDF
format and, if an error occurs, it is reported in a log file that has
the same prefix as the input file and " - download.log" as suffix.
Text file example:
Ethanol
Benzene
Aspirin
Phenol |
Single by CID.c | It downloads a structure from PubChem to the current workspace by specifying the CID code. If the code is wrong, an error message is shown. |
Single by name.c | It downloads a structure from PubChem to the current workspace by specifying its name. If the molecule is not available, an error message is shown. |
CID.c | This script asks PubChem for the CID code of the molecule in the current workspace. If the molecule is not included in the database, an error message is shown. The molecule is identified by submitting its SMILES string. |
IUPAC name.c | This script asks PubChem for the IUPAC name of the molecule in the current workspace. If the molecule is not included in the database, an error message is shown. The molecule is identified by submitting its SMILES string. |
Name.c | This script asks PubChem for the name of the molecule in the current workspace. If the molecule is not included in the database, an error message is shown. The molecule is identified by submitting its SMILES string. |
Multiple IUPAC names.c |
This script asks PubChem for the IUPAC names of the molecules by
specifying their CIDs in a CSV/text file. The first line of this file can
be the column label (not mandatory). The IUPAC names are stored in a CSV
file that can be specified by the user.
Input file example:
CID
243
3339
128563
3236
Output file example:
CID;IUPAC
243;"benzoic acid"
3339;"propan-2-yl 2-[4-(4-chlorobenzoyl)phenoxy]-2-methylpropanoate"
128563;"methyl (2S,4aR,6aR,7R,9S,10aS,10bR)-9-acetyloxy-2-(furan-3-yl)-6a,10b-dimethyl-4,10-dioxo-2,4a,5,6,7,8,9,10a-octahydro-1H-benzo[f]isochromene-7-carboxylate"
3236;"1-(4-ethylphenyl)-2-methyl-3-piperidin-1-ylpropan-1-one" |
XLogP.c | This script gets the XLogP value of the molecule in the current workspace from PubChem. If the molecule is not included in the database, an error message is shown. The molecule is identified by submitting its SMILES string. |
Scripts for QSAR.
Data normalizer.c | This script normalizes the values of the specified columns of a given CSV spreadsheet to the 0-1 range, assuming that the first row is the header of each column. The output spreadsheet is saved by adding the "- normalized.csv" suffix to the file name. |
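A normalization to the 0-1 range is commonly done by min-max scaling. The standalone Python sketch below assumes that this is what is meant; the function names are hypothetical and the handling of a constant column (mapped to 0.0) is an arbitrary choice, not documented behavior.

```python
def normalize_column(values):
    """Min-max scale a list of numbers to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_table(header, rows, columns):
    """Return a copy of rows in which the named columns are min-max scaled."""
    out = [list(row) for row in rows]
    for name in columns:
        idx = header.index(name)
        scaled = normalize_column([float(row[idx]) for row in rows])
        for row, value in zip(out, scaled):
            row[idx] = value
    return out
```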
Principal component analysis.c | This script performs the Principal Component Analysis of a given dataset in CSV format. You can choose the columns to include in the matrix to be analyzed. The script saves two files: the first one includes statistical data for each selected column, such as the mean of the values and their standard deviation, and the PCA results, such as the eigenvalues, the eigenvectors and the coefficients to project the data into the PCA space; the second file contains the projected values. The PCA calculation is done only for the first three principal components. |
Table join.c | This script joins two or more tables in CSV format. This is useful when the number of columns/rows is too large to be managed by Microsoft Excel. You can select an unlimited number of tables/spreadsheets and you can specify the join position (Bottom or Right) for each of them. The output file is automatically saved when you stop adding spreadsheets by clicking Cancel in the file requester, and its name is obtained from the first file by adding the "- join.csv" suffix. |
Training and test set creator.c |
This script helps you to create random training and test sets from
a given data set in CSV format. This is useful to validate a QSAR model,
by calculating the linear regression of the training set and using the
test set to predict the dependent variable. You can create sets that are homogeneous in terms of mean and standard deviation (the script asks you whether you want this), and you can select the properties that you want to keep homogeneous in both sets. The script standardizes the data and performs several trials to split the two sets randomly. The iterative process stops when the differences between the training and test sets, in both the mean of the means and the mean of the standard deviations of the properties, are less than a user-defined value (0.01). This script writes two CSV files as output, one for the training set and one for the test set, adding "- training" and "- test" to the file name, respectively. |
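The iterative split can be sketched in standalone Python as follows. This is one possible reading of the description, not the script's code: all names are hypothetical, z-scores with the population standard deviation are assumed for the standardization, and the stopping criterion compares the mean of the per-property means and the mean of the per-property standard deviations between the two sets.

```python
import random
import statistics


def standardize(column):
    """Z-score a column (assumption: population standard deviation)."""
    mean = statistics.mean(column)
    sd = statistics.pstdev(column)
    return [(v - mean) / sd if sd else 0.0 for v in column]


def _mean_of(func, columns, rows):
    """Mean over all properties of func applied to the selected rows."""
    return statistics.mean([func([col[i] for i in rows]) for col in columns])


def homogeneous_split(columns, n_train, threshold=0.01, max_trials=10000, seed=None):
    """Random split repeated until training and test sets look alike.

    columns: one list of values per property to keep homogeneous.
    Returns (worst difference, training row indices, test row indices) of the
    best trial found within max_trials.
    """
    n_rows = len(columns[0])
    if not 0 < n_train < n_rows:
        raise ValueError("n_train must leave at least one row in each set")
    rng = random.Random(seed)
    z = [standardize(col) for col in columns]
    indices = list(range(n_rows))
    best = None
    for _ in range(max_trials):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        d_mean = abs(_mean_of(statistics.mean, z, train) - _mean_of(statistics.mean, z, test))
        d_sd = abs(_mean_of(statistics.pstdev, z, train) - _mean_of(statistics.pstdev, z, test))
        worst = max(d_mean, d_sd)
        if best is None or worst < best[0]:
            best = (worst, sorted(train), sorted(test))
        if worst < threshold:
            break
    return best
```

The `max_trials` limit is an addition of this sketch, so that the loop ends even when the threshold cannot be reached with the given data.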
Scripts for linear regression.
Automatic linear regression.c |
This script generates automatically all possible multiple regression
models by these steps:
The script requires a CSV file as input, which can be exported from your preferred spreadsheet software (e.g. Microsoft Excel), and generates an output file named with the same prefix as the input file followed by "- regression.txt". The output file includes information such as the best independent variables, the collinear variable pairs, all regression models and the best regression models (three for each number of regressors). |
Linear regression.c | This script performs the multiple linear regression and requires a CSV file as input. In two steps, you can select the dependent variable (usually the activity) and the independent variables from the list built from the first row in the spreadsheet. |
Model validator.c |
This script validates QSAR models by randomly splitting
the whole dataset into a number of training and test set pairs. For each
training set, the regression coefficients are calculated and used to evaluate the
test set in terms of standard deviation of the errors, angular coefficient,
intercept and r² of the trend line
of the chart of the predicted vs. experimental activities. To use this script, you must specify the file containing the data of the regression analysis, which must be in CSV format and can be easily exported from your preferred spreadsheet. Then, you must select the dependent variable (usually the activity) and the independent variables of the QSAR model that you have found previously, for example with the Automatic linear regression script. Finally, you must enter the number of molecules of the training set and the number of random trials. At the end of the calculation, a CSV output file is written in the same directory as the data file by adding the "- validation.csv" suffix to the original file name. This output can be opened with a spreadsheet and it includes the columns shown below:
The output file also includes the mean (Mean) and the standard deviation (StdDev) of the previous columns and the labels of the columns selected as dependent (DepVar) and independent (InDepVar) variables. |
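The trend-line parameters mentioned above (angular coefficient, intercept and r²) can be obtained by an ordinary least-squares fit. The standalone Python sketch below is an illustration with a hypothetical function name; it assumes that the experimental activities are on the x axis and the predicted ones on the y axis, which the manual does not specify.

```python
def trend_line(experimental, predicted):
    """Least-squares line predicted = slope * experimental + intercept, with r^2."""
    n = len(experimental)
    if n < 2 or n != len(predicted):
        raise ValueError("two lists of equal length (at least 2 values) are required")
    mean_x = sum(experimental) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in experimental)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(experimental, predicted))
    slope = sxy / sxx                   # angular coefficient
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)      # squared Pearson correlation
    return slope, intercept, r2
```

For a model that predicts well, the slope is close to 1, the intercept close to 0 and r² close to 1.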
Scripts for the analysis of virtual screening results.
CSV to SVM light.c |
This script converts a standard CSV file to SVM Light format. It
requires the molecule names as the first column and an activity /
dependent variable column, which you can choose in a requester. Moreover,
you can also select the independent variables that are exported to the
output file. For more information, read http://svmlight.joachims.org/. |
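In the SVM Light format, each line holds a target value followed by `feature:value` pairs with increasing feature numbers starting from 1, and an optional trailing comment introduced by `#`. The standalone Python sketch below illustrates the conversion under these rules; `to_svmlight` is a hypothetical name, and writing the molecule name in the comment is a choice of this sketch, not documented behavior of the script.

```python
def to_svmlight(header, rows, activity, features):
    """Convert table rows to SVM Light lines.

    header: list of column labels; rows: lists of strings, molecule name first.
    activity: label of the target column; features: labels of the exported columns.
    """
    target_idx = header.index(activity)
    feature_idx = [header.index(name) for name in features]
    lines = []
    for row in rows:
        pairs = " ".join(
            f"{number}:{float(row[idx]):g}"
            for number, idx in enumerate(feature_idx, start=1)
        )
        lines.append(f"{float(row[target_idx]):g} {pairs} # {row[0]}")
    return lines
```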
Enrichment factor analysis.c | This script helps you to set up a virtual screening calculation by analyzing the enrichment factor obtained by screening sets that include true-active and decoy molecules. The data must be in CSV format and you can select the activity and score columns. You can also specify the activity threshold, which indicates when a molecule must be considered active, and the cluster size for the cluster analysis. The script sorts the rows in ascending order of the score/property used to predict the activity, then performs the cluster analysis and shows the results in a bar plot. If the score/property successfully detects the active compounds, they are ranked at the top of the sorted list and populate the first clusters. The enrichment quality is evaluated in terms of skewness and kurtosis. In particular, a kurtosis value close to zero indicates a Gaussian distribution, whereas a high value indicates an asymmetric curve. The aim of this kind of analysis is to obtain a highly asymmetric curve shifted to the left of the plot, which corresponds to a high kurtosis value. As a rough guide, kurtosis values less than 5 can be considered poor and values greater than 5 good. |
Enrichment factor optimizer (manual).c |
This script can be used to improve the enrichment factors of a virtual
screening analysis. More in detail, it allows a new scoring function to
be obtained, resulting from the linear combination of two or more
user-defined descriptors such as docking scores and molecular
properties. The coefficients of this first-degree equation are
calculated by maximizing the number of the active compounds in the top
of the list in which the molecules are ranked by the score calculated
through the new equation. The maximization is performed by the
gradient-free Hooke-Jeeves algorithm and, in order to avoid local
maxima, a random sampling is also applied. As input, a CSV file is
required, containing one activity and several score/properties columns
that you must select. Moreover, you must also specify the activity
threshold to indicate when a molecule must be considered active or not.
The output is shown in the VEGA ZZ console as in the following example:
File name.....................: bestranking.csv
Activity column...............: ACTIVITY
Activity range................: 0.00 - 1.00
Activity threshold............: 0.50
Number of molecules...........: 2513
Number of active molecules....: 38
Max. minimization steps.......: 5000
RMS to stop minimization......: 0.001
Random sampling steps.........: 36
Random selection probability..: 1.51 %
Score = 1.0000 SCORE_0000 + 0.2309 SCORE_+000 - 0.5851 SCORE_0+00
Top %   Mols   Act   Act %      EF
=================================
 1.00     25     4   16.00   10.58
 2.00     50     6   12.00    7.94
 5.00    125    13   10.40    6.88
10.00    251    18    7.17    4.74
20.00    502    24    4.78    3.16
The coefficients of the equation are divided by the coefficient of the first term. |
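The EF column of the example output is consistent with the usual definition of the enrichment factor: the fraction of actives in the top of the ranked list divided by the fraction of actives in the whole set (38 / 2513 = 1.51 %, the random selection probability). The standalone Python sketch below reproduces the table under that assumption; `enrichment_table` is a hypothetical name, and truncating the number of molecules in each top fraction to an integer is inferred from the Mols column.

```python
def enrichment_table(ranked_active, fractions=(0.01, 0.02, 0.05, 0.10, 0.20)):
    """Enrichment factors for the top fractions of a ranked list.

    ranked_active: list of booleans, best-scored molecule first,
    True when the molecule is active.
    Returns tuples (top %, molecules, actives, active %, EF).
    """
    total = len(ranked_active)
    random_rate = sum(ranked_active) / total  # probability of picking an active at random
    rows = []
    for fraction in fractions:
        n_top = int(total * fraction)         # truncated, as in the Mols column
        if n_top == 0:
            continue
        hits = sum(ranked_active[:n_top])
        hit_rate = hits / n_top
        rows.append((fraction * 100, n_top, hits, 100 * hit_rate, hit_rate / random_rate))
    return rows
```

For the first row of the example, 4 actives among 25 molecules give 16.00 %, and 16.00 / 1.51 yields an EF of 10.58.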
Enrichment factor optimizer.c |
The script uses the same approach as Enrichment factor optimizer (manual).c, adding
the automatic selection of the variables to obtain the best mathematical
models in terms of enrichment factors. You can select the activity,
the independent variables/molecular descriptors and scores to be
combined to obtain the maximum enrichment factor. Although this script
has a parallel design, it could require a long time to complete the
calculation, especially when you select a large number of equation terms
(more than three). You can also specify the threshold for the detection
of active and inactive compounds, the number of variables used to build
the models and the cluster size for the cluster analysis. The results are sorted from best to worst enrichment factor and saved in a CSV file that can be analyzed by your preferred spreadsheet. The output file (named prefix - model.csv) includes several columns:
This script also validates the best models by building five pairs of training and external sets (with a 70/30 % ratio) from the starting dataset. The training set is used to recalculate the models and the external set to predict the activity. The results of this analysis are saved to prefix - valitadion.csv, which contains the same data as for the models obtained from the whole dataset, except for the population of the clusters. The column headers carry the ts and es prefixes to identify the training and the external sets, respectively. |
It contains scripts for trajectory management.
Anim maker.c | This script generates a trajectory file
starting from the molecule in the current workspace, rotating it around one
or more axes. This is useful for creating video files. The parameters that
you can change are:
|
APBS trajectory.c | This script calculates the solvation energy
for each frame included in an MD trajectory and saves the values in a CSV
file. It uses APBS for Windows, which is included in the VEGA ZZ package.
APBS is a software package for modeling biomolecular solvation by solving
the Poisson-Boltzmann equation (PBE), developed by Nathan Baker
in collaboration with J. Andrew McCammon and Michael Holst. For more information about APBS, visit http://www.poissonboltzmann.org/apbs/ |
Automatic quenching.r | This script extracts the frames from a
trajectory file, then minimizes them using AMMP or Mopac. The results are
stored in an output trajectory file. You can set the following parameters:
|
DCD fix for VMD.c | All pre-3.0.0 VEGA ZZ releases write buggy DCD files that can't be read by VMD. This script fixes the problem by patching the DCD trajectory, only if the problem is detected. |
Dump energy.c | This script calculates the energy for each
MD frame and dumps the molecular mechanics energy components in a CSV file.
It also performs a histogram analysis.
|
Enantiomerizer.r | It converts the trajectory to another format
inverting all chiral atoms. You can specify the following parameters:
Click Go ! button to start the conversion and Cancel button to close the window. |
Frame extractor.r | It extracts the frames from a trajectory file (Input Traj.), saving them in the specified directory (Output Dir.). You can change Quenching step, Output format and Compression method. |
NAMD SMD force plot.c | This script shows the force/frame, force/distance and distance/frame of a steered molecular dynamics simulation by reading the NAMD output file. |
PELE PDB fix.c | This script fixes the non-standard PDB
files generated by PELE so that they can be read by VEGA ZZ. For more information about PELE, click here. |
Ramachandran.c |
This script performs the Ramachandran analysis for each trajectory frame. Before running it, you must open a trajectory file. For each frame, the Phi and Psi backbone torsion angles are measured and checked to see whether they fall inside or outside the allowed regions of the Ramachandran plot. For each frame, the percentage of residues inside the allowed regions, relative to the total number of residues, is calculated and these values are shown in a plot. This calculation is useful to highlight the evolution of the secondary structure during an MD simulation. If the percentage of residues (Phi and Psi values) inside the allowed regions decreases during the simulation, the secondary structure is getting worse. Vice versa, if the percentage grows, the secondary structure is improving. |
SDF export.c | It converts the current trajectory into an SDF database. Each structure in the database corresponds to a frame in the trajectory file. |
Water remover.r |
It removes all water molecules from a trajectory, converting it into a PDB multimodel file. This script is obsolete and is kept as an example only. The same function is now implemented in VEGA ZZ without external scripts. |
This directory includes the generic scripts. Some of these require REBOL/View.
Bin2h.c | This script for developers converts a binary file to a C header file including a byte vector or a Base64-encoded string. In the latter case, to decode the data, you can use the HD_Base64EncodeMem() HyperDrive function by including the hdbase64.h file. |
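The two output styles mentioned above can be illustrated with a standalone Python sketch. It does not reproduce the exact layout written by Bin2h.c: the function names, the C declarations and the line width are assumptions made for the example.

```python
import base64


def bin_to_header(data, name="data"):
    """Render bytes as a C array of unsigned char, 12 values per line."""
    lines = [f"static const unsigned char {name}[{len(data)}] = {{"]
    for start in range(0, len(data), 12):
        chunk = ", ".join(f"0x{byte:02x}" for byte in data[start:start + 12])
        lines.append(f"    {chunk},")
    lines.append("};")
    return "\n".join(lines) + "\n"


def bin_to_base64_header(data, name="data_b64"):
    """Render bytes as a C string constant holding the Base64 encoding."""
    encoded = base64.b64encode(data).decode("ascii")
    return f'static const char {name}[] = "{encoded}";\n'
```

The byte vector can be used directly by the C code, while the Base64 string must be decoded at run time.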
Calculator.r | Simple calculator (script by Ryan S. Cole). |
Calendar.r | Calendar and scheduler (script by Sterling Newton). |
Clock.r | Digital clock (script by Carl Sassenrath). |
Console.r | It opens the REBOL console. |
CPU load.c | It shows the CPU load in a small window. |
Image viewer.r | Image viewer. |