3. Usage

At present, VEGA can be run only from a command shell. If you run the program without parameters, it prints the list of implemented options:

VEGA V1.5.0 - (c) 1996-2003, Alessandro Pedretti & Giulio Vistoli
Virtual logP by Bernard Testa et al.
Windows 9x/ME/NT/2000/XP Pentium version.

Synopsis: vega INPUT ... -o[OUT.PACK] -f[OUTPUT_FORMAT] -p[FORCE_FIELD]
          -s[POINTS] -g[RADIUS] -c[TEMPLATE] -k[KEYWORDS] -a[RES_NUM]
          -d[DIELECTRIC] -e[NAME:NUM] -i[SHELL RAD SHAPE] -m[KEYWORDS]
          -t[PORT NUMBER] -bhnrw

a -> renumber residues starting from RES_NUM
b -> don't save the connectivity
c -> charge template
d -> dielectric constant for energy calculation
e -> residue for energy calculation
f -> output format
g -> probe radius for SAS
h -> show this help
i -> solvate the molecule
k -> keywords for Mopac
m -> keywords for trajectory analysis
n -> normalize coordinates
o -> output file name
p -> define force field to apply
r -> remove hydrogens
s -> point density for SAS
t -> port number (Win32 only)
w -> remove waters

INPUT formats:
Alchemy, BioDock, CAR, CHARMm CRD, CHARMm DCD, CSSR, ESCHER NG, GAMESS,
Gromacs/Gromos mol, Gromacs XTC, HIN, IFF, Mol2, Mopac, MDL mol, MSF, PDB,
PDBA, PDBF, QMC, Quanta CSR, XYZ.

OUTput formats:
Calc:     CVFF, Info.
Map:      BiosymSrf, ComfaFld, CsvIlm, CsvLogP, CsvMep, CsvSrf, QuantaIlm,
          QuantaLogP, QuantaMep, QuantaSrf.
Molecule: Alchemy, CRD, CSSR, Fasta, GAMESS, Gromos, GromosNm, IFF, MdlMol,
          Mol2, MSF, PDB, PDB2, PDBQ, PDBA, PDBF, PDBNOTSTD, PSFX, QMC,
          OldBiosym, Biosym, MopInt, XYZ.
Plot:     BinPlt, CSV, QuantaPlt.
VRML:     Vrml, VrmlPts, VrmlCpk, VrmlSol.

PACKer formats:
bz2 (BZip2), gz (GZip), pp (PowerPacker), z (Z-Compress).

TRAJECTORY keywords (-m):
Angle A1 A2 A3, Dipole, Distance A1 A2, ILM, LipoleBr, LipoleCr,
PlaneAng A1 A2 A3 A4 A5 A6, PSA, Surface A1 ..., SurfDia A1 ...,
Torsion A1 A2 A3 A4, VlogP, VolDia, Volume.

All parameters are optional except the input file name (INPUT). The copyright line for Virtual logP does not appear if you have a full VEGA release, which includes the non-free logP calculation routines.

 

3.1 INPUT …
With this option, you specify the name of the input file. VEGA recognizes the input file format automatically. The supported input formats are: Accelrys Quanta/CHARMm CRD and DCD, Cambridge Data File (CSSR), Gromos/Gromacs .gro, Interchange File Format (IFF), Tripos Sybyl (Mol2), Accelrys Quanta MSF, Protein Data Bank (PDB), Protein Data Bank Fat (PDBF), Accelrys Insight .car, Mopac internal coordinates, Cartesian coordinates (XYZ).
You can load more than one file at once, in the same or different file formats, to create a molecular assembly. Connectivity is calculated separately for each loaded file to prevent connectivity errors between bumping molecules.
The Data Decompressor Engine was introduced in release 1.2. This VEGA module lets you use compressed files without unpacking them first. In other words, packed files are handled in the same way as unpacked files, without any external decompressor. VEGA supports the following packing formats:

Format name        File extension
BZip2              .bz2
GZip               .gz
PowerPacker        .pp
Unix Un/compress   .Z
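
As an illustration of how packed input can be handled transparently, the following Python sketch chooses a decompressor from the file extension. It is not VEGA's Data Decompressor Engine: open_packed is a hypothetical helper, and it covers only the gz and bz2 formats because those are the two with readers in the Python standard library.

```python
import bz2
import gzip
import os

# Hypothetical helper, not part of VEGA. Maps a packer extension to a
# standard-library opener.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}
# PowerPacker (.pp) and Unix compress (.Z) have no standard-library reader.
UNSUPPORTED = {".pp", ".z"}


def open_packed(path):
    """Open a possibly packed text file for reading, chosen by extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext in UNSUPPORTED:
        raise NotImplementedError(f"no standard-library reader for {ext}")
    opener = OPENERS.get(ext, open)  # unpacked files fall back to open()
    return opener(path, "rt")
```

The caller reads the returned file object in the same way whether or not the file was packed.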

 

3.2 -a[RES_NUM]
This option renumbers all residues starting from RES_NUM. If the value is not specified, VEGA starts from one. Residue renumbering is very useful when you create an assembly from two or more molecules.
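
The idea can be sketched in Python as follows. This is only an illustration under stated assumptions: renumber_residues is a hypothetical helper, residues are assumed to be numbered consecutively in file order, and each atom is assumed to carry a key such as (molecule index, old residue number) so that residues from different molecules are not merged.

```python
def renumber_residues(residue_keys, start=1):
    """Return one new residue number per atom, consecutive from `start`.

    residue_keys: one hashable key per atom, in file order, for example
    (molecule_index, old_residue_number). Consecutive atoms with the same
    key stay in the same residue.
    """
    new_numbers = []
    current = start - 1
    previous = object()  # sentinel that compares unequal to any key
    for key in residue_keys:
        if key != previous:
            current += 1
            previous = key
        new_numbers.append(current)
    return new_numbers
```

With two single-residue molecules that both use residue number 1, the keys (0, 1) and (1, 1) yield residues 1 and 2 in the assembly.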

 

3.3 -b
With this switch, you can disable saving of the connectivity for those molecular formats that can store it (e.g. PDB, PDBF, IFF). Many molecular packages interpret the CONECT field in PDB files incorrectly; to work around this problem, you can save the molecule without connectivity data.

 

3.4 -c[TEMPLATE]
At present, VEGA can assign partial atomic charges in two ways: the first is based on the Gasteiger-Marsili method, and the second relies on a fragment database with atomic charges calculated by semi-empirical or ab initio methods. With the second method, charges cannot be assigned if the fragment is not present in the database.
The Gasteiger method is universal and uses a multi-step procedure:

 

3.5 -d[DIELECTRIC]
If you want to calculate the interaction energy (see the -f[FORMAT] option) with a specific dielectric constant, use this parameter. The default value of the dielectric constant is stored in the prefs file and is usually set to 1.0.

 

3.6 -e[NAME:NUM]
This parameter is compulsory for the interaction energy evaluation (see the -f[FORMAT] option): it tells VEGA which residue (the ligand) the energy is computed for. You can specify the residue number only, or the residue name and the residue number (e.g. -e 9999 or -e THA:999). If you use -e without any argument, VEGA uses the default residue number stored in the prefs file (usually 9999).

 

3.7 -f[OUT.PACK]
With this parameter, you can create an output file in a specific file format. If -f is omitted, the default output format is unpacked PDB full standard (see PDB specifications). OUT indicates the format and PACK is the optional compression method (bz2, gz, pp and z, see INPUT). These two keywords are case-insensitive.

e.g.    -f CSSR       CSSR output without compression.
        -f pdb.Z      PDB output with Unix compression.
        -f xyz.bz2    XYZ output with BZip2 compression.
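
The OUT.PACK convention can be illustrated with a short Python sketch. The function parse_output_format is hypothetical and only mirrors the rules stated above (an optional packer suffix, both parts case-insensitive); it is not taken from the VEGA sources.

```python
# Packer keywords listed in this section, in lowercase.
PACKERS = {"bz2", "gz", "pp", "z"}


def parse_output_format(argument):
    """Split an OUT.PACK argument into (FORMAT, packer or None)."""
    name, dot, suffix = argument.rpartition(".")
    if dot and suffix.lower() in PACKERS:
        return name.upper(), suffix.lower()
    # No recognized packer suffix: the whole argument is the format name.
    return argument.upper(), None
```

For example, parse_output_format("pdb.Z") gives ("PDB", "z"), while parse_output_format("CSSR") gives ("CSSR", None).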

 

3.7.1 Calculation formats

Keyword   Description
CVFF      Evaluation of interaction energy.
INFO      Information about the molecule.

3.7.1.1 Evaluation of interaction energy
VEGA can evaluate the ligand-biomacromolecule interaction energy through a molecular mechanics calculation of the non-bond term (R6-R12 Lennard-Jones) and the coulombic term. At present, only the CVFF force field is implemented. Remember that both ligand and receptor must have the force field and atomic charges correctly assigned (see the -p and -c options). For the energy evaluation, you can specify the dielectric constant with the -d option (default 1.0) and the ligand with the -e option. After the energy calculation, VEGA shows (or writes to a file) the total interaction energy, its components (non-bond and coulombic energies) and a table of the receptor residues whose partial interaction energy is greater than 1% of the total energy. This threshold can be changed in the preferences file.
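
The two energy terms and the residue table can be sketched in Python as follows. This is a generic illustration, not VEGA's implementation: the pair parameters a and b must come from the force field and are not provided here, the Coulomb conversion constant is a commonly used value for kcal/mol with distances in Å and charges in electron units, and comparing absolute values against the threshold is an assumption.

```python
# Assumed conversion constant: kcal*Angstrom/(mol*e^2).
COULOMB_CONSTANT = 332.0637


def pair_energy(r, a, b, q1, q2, dielectric=1.0):
    """Return (non-bond, coulombic) energy of one atom pair at distance r.

    a and b are the repulsive (r^-12) and attractive (r^-6) pair
    parameters; q1 and q2 are the partial atomic charges.
    """
    nonbond = a / r**12 - b / r**6
    coulomb = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    return nonbond, coulomb


def significant_residues(per_residue, threshold=0.01):
    """Keep residues whose energy exceeds `threshold` of the total.

    per_residue maps a residue label to its partial interaction energy.
    Magnitudes are compared (an assumption of this sketch).
    """
    total = sum(per_residue.values())
    limit = threshold * abs(total)
    return {res: e for res, e in per_residue.items() if abs(e) > limit}
```

Summing pair_energy over all ligand-receptor atom pairs, grouped by receptor residue, gives the per-residue dictionary that significant_residues filters.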

3.7.1.2 Information about the molecule
If you want more information about the input molecule, use the -f INFO option. VEGA then shows the following information: total number of atoms, number of heavy atoms, number of residues, number of molecules contained, number of water molecules, molecular weight, coordinates of the geometric center, coordinates of the mass center, approximate dimensions, total charge (calculated from the atomic charges), dipole, surface area, surface diameter, polar surface area, volume, volume diameter, ovality (only if the probe radius used for the surface calculation is zero, see the -g option), Crippen's logP and lipole, Broto's logP and lipole, Virtual logP (available only in the full release), predicted charge (only for proteins; it is calculated by searching for ionizable groups), amino acid charge (only for proteins; it is calculated at physiological pH from the amino acid composition), and amino acid or nucleotide composition:

************************************
**** Information about molecule ****
************************************

Atoms..............: 48
Heavy atoms........: 25
Residues...........: 1
Molecules..........: 1
Waters.............: 0
Molecular weight...: 345.384 Daltons
Geometry center....: 7.1076 3.6789 0.5790
Mass center........: 6.9492 3.5914 0.5256
Appx. dimensions...: 17.4088 10.7721 10.7163
Total charge.......: 0.0003
Dipole.............: 1.0292 Debye
Surf. area (0.00)..: 383.3 Ų (ds=11.0 Å)
Polar area (PSA)...: 50.6 Ų (apolar=332.7 Ų)
Volume.............: 362.3 ų (dv=8.8 Å)
Ovality............: 1.6
logP (Crippen).....: 1.9275
Lipole (Crippen)...: 0.4363
logP (Broto).......: 3.0390
Lipole (Broto).....: 0.4755
Virtual logP.......: 3.1402

Please note that if the total number of atoms exceeds the value of the MAXATMINFO key in the prefs file, the surface area, surface diameter, volume, volume diameter, ovality and logP values are not shown.
If the molecule is a protein or a nucleic acid, the following data are shown:

...

Total charge.......: -23.0004
Predicted charge...: -24
Aminoacidic charge.: -24

Aminoacidic composition:

Res    N.   N. %     Mass     Mass %
====================================
ALA    46   6.29  3269.690    3.57
ARG    42   5.75  6618.506    7.22
ASN    29   3.97  3309.140    3.61
ASP    43   5.88  4921.520    5.37
CYS    18   2.46  1855.515    2.02
GLU    53   7.25  6789.680    7.41
GLN    46   6.29  5894.132    6.43
GLY    40   5.47  2282.165    2.49
HIS    26   3.56  3565.805    3.89
ILE    37   5.06  4186.861    4.57
LEU    86  11.76  9731.422   10.62
LYS    30   4.10  3875.539    4.23
MET    11   1.50  1443.115    1.57
PHE    25   3.42  3681.328    4.02
PRO    35   4.79  3401.088    3.71
SER    42   5.75  3657.370    3.99
THR    24   3.28  2426.550    2.65
TRP    17   2.33  3165.578    3.45
TYR    35   4.79  5710.992    6.23
VAL    46   6.29  4560.075    4.98

Warning:
If the protein has no hydrogens, the predicted charge is not shown. If the protein contains special non-amino-acid groups and/or metal ions, the predicted charge can be incorrect.

 

3.7.2 Molecule formats

Keyword     Description
ALCHEMY     Alchemy format.
BIOSYM      New Biosym .car file (archive 3).
CRD         CHARMM text file format.
CSSR        Cambridge Data File.
FASTA       Not a real molecular file format: it can store only the primary
            structure of proteins and DNA/RNA sequences.
GAMESS      Cartesian GAMESS format.
GROMOS      The special file format of the molecular mechanics package
            Gromos/Gromacs.
GROMOSNM    GROMOS with the coordinates in nanometers.
IFF         Interchange File Format. This is a binary file with an AmigaOS
            chunk structure (like IFF-ILBM, AIFF, etc). All chunks are
            optional and the structure is fully expandable (see Appendix D).
MDLMOL      MDL Molfile.
MOL2        Tripos Sybyl Mol2 file format.
MOPINT      Mopac internal coordinates file (.dat), useful to link Mopac with
            other software packages. The Mopac keyword CHARGE is calculated
            automatically from the atomic charges. Other keywords can be
            specified with the -k[KEYWORDS] option. The preferences file of
            VEGA (prefs in the Data directory) contains a special record with
            the Mopac keywords used by default.
MSF         MSI Quanta binary file. Its complexity and the poor documentation
            available have not allowed a full implementation of this format.
            You can only overwrite an existing MSF file (which must be
            compatible with the input), but not create a new file.
OLDBIOSYM   Old Biosym (Accelrys) .car file (archive 1).
OPENGL      Not a real molecular format. It's only a switch to force the
            visualization of the molecule with the OpenGL graphic interface.
            This function is available only in the Win32 version of the VEGA
            package.
PDB         PDB pre-2.0 specifications.
PDB2        PDB 2.2 full standard (default).
PDBA        PDB full standard with special records to include atomic charges,
            force field parameters and the ATDL description for each atom.
            It’s fully compatible with the PDB standard, because the extra
            information is placed in REMARK records.
PDBF        PDB full standard with special REMARK records to include atomic
            charges and force field parameters. It’s also fully compatible
            with the PDB standard.
PDBNOTSTD   Simplified PDB format, more compatible with software packages
            that have a partial implementation of the Brookhaven
            specifications. Special records (HETATM, TER, CONECT and MASTER)
            are not used.
PDBQ        PDB full standard with atomic charges placed in the rightmost
            column.
PSFX        PSF topology in X-Plor sub-format, required for molecular
            dynamics (e.g. CHARMM and NAMD).
QMC         CSSR variant.
XYZ         Cartesian coordinates file. The first record is the total number
            of atoms and the next records describe one atom each. The atom
            record contains the element name and the X, Y, Z Cartesian
            coordinates.
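
As an illustration of the XYZ layout described above, here is a minimal Python sketch. The function format_xyz is hypothetical, and the column widths are an arbitrary choice. Many XYZ variants also place a comment line after the atom count; this sketch follows the description given here and omits it.

```python
def format_xyz(atoms):
    """Return XYZ file text for an iterable of (element, x, y, z) tuples.

    The first record is the number of atoms; each following record holds
    the element name and the X, Y, Z Cartesian coordinates.
    """
    atoms = list(atoms)
    lines = [str(len(atoms))]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2s} {x:12.6f} {y:12.6f} {z:12.6f}")
    return "\n".join(lines) + "\n"
```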

 

3.7.3 Plot formats
All these output formats are useful for trajectory analysis (see the -m[KEYWORDS] option).

Keyword     Description
BINPLT      Generic binary plot. It’s a sequence of single precision floats
            in big endian format.
CSV         ASCII text file with each field separated by a semicolon.
QUANTAPLT   Accelrys Quanta plot file.
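
A minimal Python sketch of how the BINPLT and CSV plot outputs could be read by another program is given below. Both helpers are hypothetical. The BINPLT reader assumes that the file holds nothing but the float sequence (no header), as the description above suggests.

```python
import csv
import io
import struct


def read_binplt(data):
    """Decode bytes holding big-endian single-precision floats."""
    count = len(data) // 4  # each value occupies 4 bytes
    return list(struct.unpack(f">{count}f", data[: count * 4]))


def read_csv_plot(text):
    """Parse semicolon-separated text into rows of string fields."""
    return list(csv.reader(io.StringIO(text), delimiter=";"))
```

Because the byte order is given explicitly in the format string, read_binplt returns the same values on little-endian and big-endian machines.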

 

3.7.4 Surface and map formats
VEGA can calculate the van der Waals and solvent-accessible molecular surfaces. To enable this function, use the -f[OUTPUT_FORMAT] option as shown in the following table:

Keyword       Type     Description
COMFAFLD      Text     COMFA 3D field. When you select this output, you must
                       specify the field type with the -m[KEYWORD] option. A
                       Sybyl .rgn file is also needed as input. At present,
                       the only implemented field is vlogP*.
BIOSYMSRF     Text     van der Waals and solvent-accessible molecular surface
                       for the Insight II package.
CSVILM        Text     Molecular hydropathicity index (ILM) surface in CSV
                       (Comma Separated Values) format.
CSVLOGP*      Text     Virtual logP surface in CSV format.
CSVMEP        Text     Molecular Electrostatic Potential (MEP) in CSV format.
CSVSRF        Text     van der Waals and solvent-accessible molecular surface
                       in CSV format.
QUANTAILM     Binary   Molecular hydropathicity index (ILM) surface in Quanta
                       format.
QUANTALOGP*   Binary   Virtual logP surface in Quanta format.
QUANTAMEP     Binary   Molecular Electrostatic Potential (MEP) in Quanta
                       format.
QUANTASRF     Binary   van der Waals and solvent-accessible molecular surface
                       for the Quanta package.

The default calculation is the water-accessible surface (1.4 Å sphere radius). To change the solvent (probe) radius, use the -g[RADIUS] option. If you set the probe radius to zero, VEGA calculates the van der Waals surface. The standard point density is 10 points per Å². See the -s[POINTS] option to change this value.

* Available only in full release of VEGA.

 

3.7.5 VRML formats
To support web publishing, the Virtual Reality Modeling Language (VRML) is implemented in VEGA. To use this function, use the -f[OUTPUT_FORMAT] option with the following keywords:

Keyword   VRML output
VRML      VRML 1.0 wireframe representation with standard coloring method.
VRMLCPK   VRML 1.0 CPK representation with standard coloring method.
VRMLPTS   VRML 1.0 dotted surface representation.
VRMLSOL   VRML 1.0 van der Waals and solvent-accessible molecular solid
          surface.

The VRML surface formats accept the same options as the standard surface outputs (see section 3.7.4).

 

3.8 -g[RADIUS]
If you want to calculate a surface map with a probe radius different from the default one (the 1.4 Å water radius) without changing the prefs file, use this option. Remember that to calculate the van der Waals surface, you must set this parameter to zero.

 

3.9 -i[SHELL RAD SHAPE]
VEGA can solvate a molecule with virtually any type of solvent (e.g. H2O, CCl4, etc). The cluster file must be placed in the Data/Clusters (Data\Clusters) directory and can be in any VEGA-supported format (also packed). It is an optimized solvent assembly with a cubic shape (usually 50x50x50 Å), stored under an uppercase file name without extension (e.g. WATER, CCL4, etc).
SHELL is the solvent cluster name (e.g. WATER). SHAPE is the form of the solvation cluster: BOX for cubic clusters, SPHERE for spherical clusters and LAYER to solvate with a layer of solvent. RAD is a value in Å that defines the box side for BOX, the sphere radius for SPHERE and the layer thickness for LAYER.

 

3.10 -k[KEYWORDS]
This option passes the control keywords used when a Mopac input file is generated (-f MOPINT option). Remember to use quotes (") if you specify more than one keyword. In the prefs file, you can specify the default Mopac keywords.

 

3.11 -m[KEYWORDS]
This option is needed if you want to make a specific measurement in a molecular dynamics trajectory file (CHARMm, Gromacs XTC, PDB multimodel and Quanta CSR formats). You must specify a keyword to set the type of measurement and, where required, the atom selection:

Keyword                      Description
ANGLE A1 A2 A3               Bond angle.
DISTANCE A1 A2               Bond length.
DIPOLE                       Molecular dipole moment.
ILM                          Molecular lipophilicity index (water cluster
                             required).
LIPOLEBR                     Lipole (Broto & Moreau).
LIPOLECR                     Lipole (Ghose & Crippen).
SURFACE A1 ...               Surface area.
SURFDIA A1 ...               Surface diameter. It's the diameter of a
                             theoretical sphere with the surface area of the
                             molecule.
PLANEANG A1 A2 A3 A4 A5 A6   Angle between the planes defined by A1, A2, A3
                             and A4, A5, A6.
TORSION A1 A2 A3 A4          Torsion angle.
VLOGP                        Virtual logP*.
VOLUME                       Molecular volume.
VOLDIA                       Volume diameter. It's the diameter of a
                             theoretical sphere with the volume of the
                             molecule.
PSA                          Polar surface area.

* Available only in full release of VEGA.
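
The surface diameter and volume diameter defined in the table follow directly from the formulas for the area (π d²) and volume (π d³ / 6) of a sphere of diameter d. The Python sketch below shows the arithmetic; the function names are illustrative only.

```python
import math


def surface_diameter(area):
    """Diameter of the sphere whose surface area equals `area`."""
    # area = pi * d**2  ->  d = sqrt(area / pi)
    return math.sqrt(area / math.pi)


def volume_diameter(volume):
    """Diameter of the sphere whose volume equals `volume`."""
    # volume = pi * d**3 / 6  ->  d = (6 * volume / pi) ** (1/3)
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)
```

Applied to the sample output in section 3.7.1.2 (surface area 383.3 Ų, volume 362.3 ų), these formulas give about 11.0 Å and 8.8 Å, which agrees with the ds and dv values printed there.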

To select each atom required in the measurement (e.g. A1, A2, etc), you must use either the atom number alone or the following syntax: ATOM:RESNAME:RESNUM. RESNAME and RESNUM are optional if ATOM is unambiguous. Suppose you have a benzene ring and want to indicate the third atom, as shown in the following PDB file:

...
ATOM      2  C2  BEN     1       -0.695   1.203  -0.002  1.00  0.00
ATOM      3  C3  BEN     1       -1.389   0.000  -0.006  1.00  0.00
ATOM      4  C4  BEN     1       -0.695  -1.203  -0.007  1.00  0.00
...

you can use 3, C3, C3:BEN or C3:BEN:1 interchangeably. If you want to select atom 482 in a polypeptide sequence where only one proline is present, you can indicate it with 482, CA:PRO or CA:PRO:32, but not with CA alone:

...
ATOM    481  N   PRO    32      -29.658  -2.153   7.524  1.00  0.00
ATOM    482  CA  PRO    32      -28.294  -1.798   7.139  1.00  0.00
ATOM    483  C   PRO    32      -27.169  -2.471   7.908  1.00  0.00
...
ATOM    495  N   VAL    33      -25.978  -2.393   7.325  1.00  0.00
ATOM    496  CA  VAL    33      -24.749  -2.884   7.927  1.00  0.00
ATOM    497  C   VAL    33      -23.841  -1.699   7.661  1.00  0.00
...

If more than one proline is present in this sequence, you can't use CA:PRO either.
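
The selection rules can be illustrated with a small Python sketch. The function select_atom and the dictionary layout of the atom records are hypothetical, and exact, case-sensitive matching of names is an assumption; the sketch only shows how an ambiguous selector can be rejected.

```python
def select_atom(selector, atoms):
    """Resolve an atom number or ATOM[:RESNAME[:RESNUM]] to one atom.

    atoms: list of dicts with keys "number", "name", "resname", "resnum".
    Raises ValueError when the selector matches zero or several atoms.
    """
    if selector.isdigit():
        matches = [a for a in atoms if a["number"] == int(selector)]
    else:
        parts = selector.split(":")
        keys = ("name", "resname", "resnum")
        # Only the fields present in the selector are compared.
        matches = [
            a for a in atoms
            if all(str(a[k]) == p for k, p in zip(keys, parts))
        ]
    if len(matches) != 1:
        raise ValueError(f"{selector!r} matches {len(matches)} atoms")
    return matches[0]
```

With the proline and valine records shown above, "482", "CA:PRO" and "CA:PRO:32" all resolve to atom 482, while "CA" is rejected because it matches two atoms.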

At the end of the property calculation, VEGA shows the range, the average value and the standard deviation. If you want to exclude the influence of water from the calculation of dipole moment, molecular surface, Virtual logP and molecular volume, use the -w option.
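
For a series of values measured along a trajectory, such a summary can be computed as in the following Python sketch. The function summarize is hypothetical, and the use of the population standard deviation is an assumption, since this manual does not state which estimator VEGA applies.

```python
import statistics


def summarize(values):
    """Return (minimum, maximum, mean, standard deviation) of a series."""
    return (
        min(values),
        max(values),
        statistics.mean(values),
        statistics.pstdev(values),  # population form, assumed here
    )
```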

 

3.12 -n
This switch enables the normalization of atomic coordinates: the geometric center of a single molecule or a complex is moved to the origin of the Cartesian axes.
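
In terms of arithmetic, the normalization is a translation by the geometric center. A minimal Python sketch follows, assuming that the geometric center is the unweighted mean of the atomic coordinates; normalize_coordinates is an illustrative name.

```python
def normalize_coordinates(coords):
    """Translate (x, y, z) tuples so their geometric center is the origin."""
    n = len(coords)
    # Geometric center: unweighted mean of each coordinate.
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]
```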

 

3.13 -o[OUTPUT]
With the -o parameter, you can specify the name of the output file with or without extension. If the file name doesn’t have an extension, VEGA automatically adds the appropriate one on the basis of the selected output format (see the -f option). The most common extensions used by VEGA are shown in the following table:

Extension   Type   Add   File Format
.alc        T      Y     Alchemy
.arc        T      N     Mopac optimized internal coordinates
.car        T      Y     Accelrys CAR file (old and new subformat)
.cor        T      Y     Accelrys CAR file with optimized coordinates
.crd        T      Y     CHARMM
.cssr       T      Y     Cambridge Data File (CSSR)
.csv        T      Y     Surface in CSV format
.dat        T      Y     Mopac internal coordinates
.dcd        B      Y     CHARMM/NAMD trajectory file
.ene        T      N     Accelrys CHARMm energy file
.ene        T      Y     VEGA interaction energy file
.ent        T      N     PDB
.fas        T      Y     FASTA
.fld        T      Y     Tripos COMFA field
.gro        T      Y     Gromos/Gromacs
.iff        B      Y     Interchange File Format (IFF)
.inf        T      Y     VEGA information file
.inp        T      Y     GAMESS cartesian
.ml2        T      Y     Tripos Sybyl Mol 2
.mol        T      Y     MDL Molfile
.msf        B      Y     MSI Quanta
.par        T      N     VEGA parameters
.pdb        T      Y     PDB, PDB2, PDBA, PDBF and PDBQ
.psf        T      Y     PSF and PSF X-Plor
.qmc        T      N     QMC (CSSR like format)
.srf        B      Y     Accelrys Quanta surface
.srf        T      Y     Accelrys Insight surface
.tem        T      N     VEGA template
.wrl        T      Y     VRML (Virtual Reality Markup Language)
.xyz        T      Y     XYZ

In this table, Extension is the file extension, Type is the file type (T = text, B = binary), Add shows whether VEGA adds the extension automatically and File Format is the name of the file format.
If you execute VEGA without the -o parameter, the output is redirected to the console (stdout) or to a special device driver (e.g. PRT: for AmigaDOS). This is very useful to interface VEGA with another program that can read its input from the console. The redirection is possible with text file formats only.
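
The extension rule can be sketched in Python as shown below. Both the function output_file_name and the mapping are illustrative: the pairing of -f keywords with extensions is inferred from the descriptions in the table and covers only some molecule formats, so it should not be read as VEGA's internal table.

```python
import os

# Assumed pairing of some -f keywords with the extensions listed above.
DEFAULT_EXTENSIONS = {
    "ALCHEMY": ".alc", "CRD": ".crd", "CSSR": ".cssr", "FASTA": ".fas",
    "GROMOS": ".gro", "IFF": ".iff", "MDLMOL": ".mol", "MOL2": ".ml2",
    "MOPINT": ".dat", "PDB": ".pdb", "PDB2": ".pdb", "PDBA": ".pdb",
    "PDBF": ".pdb", "PDBQ": ".pdb", "XYZ": ".xyz",
}


def output_file_name(name, output_format):
    """Append the default extension only when `name` has none."""
    _, ext = os.path.splitext(name)
    if ext:
        return name  # the user supplied an extension: keep it
    return name + DEFAULT_EXTENSIONS[output_format.upper()]
```

For example, output_file_name("ligand", "mol2") gives "ligand.ml2", while a name such as "ligand.ent" is returned unchanged.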

 

3.14 -p[FORCE_FIELD]
This function assigns the atom types using a specified force field template. It is the most complex function implemented in VEGA. The first challenge was the creation of a universal language, called ATDL (Atom Type Description Language), able to describe virtually any atom type. For more information about ATDL, see section … VEGA uses the force field template files stored in the Data directory with the extension .tem (lowercase). The names of these files must be uppercase, but the argument of the -p option is case-insensitive. To assign the correct atom types, VEGA uses a multiple-step algorithm:

Although these steps are very complex, the total process speed is very high.

 

3.15 -r
This switch removes all hydrogen atoms. It’s very useful to create a PDB file suitable for uploading to molecular databases.

 

3.16 -s[POINTS]
With this parameter, you can change the point density of a surface map. POINTS is the number of points per surface unit (Å²). The default value is stored in the prefs file and is usually set to 10. For more information about surface calculation, see the -f[FORMAT] option.

 

3.17 -t[PORT NUMBER]
This option changes the number of the communication port. It's available only in the Win32 version.

 

3.18 -w
This switch removes all the water molecules present in an assembly. Note that VEGA does not find the water molecules by residue name (e.g. HOH, TIP3, etc), but on the basis of the connectivity table. This approach is slower but more precise and independent of residue naming.
You can use the -w option in trajectory analysis to neglect the influence of water in the evaluation of dipole moment, molecular surface and Virtual logP.
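
A connectivity-based search of this kind can be sketched in Python as follows. This is not VEGA's algorithm: find_waters is a hypothetical helper that assumes explicit hydrogens and treats as water any oxygen bonded to exactly two hydrogens that have no other bonds.

```python
def find_waters(elements, bonds):
    """Return (O, H, H) atom-index triples that form isolated waters.

    elements: list of element symbols, one per atom.
    bonds: iterable of (i, j) index pairs from the connectivity table.
    """
    neighbors = {i: set() for i in range(len(elements))}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)
    waters = []
    for i, element in enumerate(elements):
        if element != "O" or len(neighbors[i]) != 2:
            continue
        h1, h2 = sorted(neighbors[i])
        # Both neighbors must be hydrogens bonded to this oxygen only.
        if all(elements[h] == "H" and neighbors[h] == {i} for h in (h1, h2)):
            waters.append((i, h1, h2))
    return waters
```

Residue names play no role in the test, so a water molecule is found whether it is labeled HOH, TIP3 or anything else, while a hydroxyl oxygen is skipped because one of its neighbors is not a hydrogen.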