Software:Protein-DNA Modeling Interface Tutorial


Running the program requires a topology file (*.top) and a command file (COMFILE), both of which are described below.

Topology File: The program requires a topology file that describes the topology and force field of the biomolecules (e.g. protein and DNA) being modeled. These files are analogous to the *.top and *.crg files used by CHARMM. Two topology files are provided here for the AMBER98 force field: AMBER98.top is the standard AMBER98 force field (with the exception of improper dihedral terms), and AMBER98_0.5Phosph has the charges of the DNA phosphate groups scaled as described in Siggers & Honig (200X). The topology file is located through an environment variable (TROLLTOP). It is easiest to set this in your .tcshrc (or equivalent) file:

setenv TROLLTOP /foo/AMBER98.top
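
Before launching a run it can be useful to confirm that TROLLTOP is set and points at a real file. The following is a minimal Python sketch of such a check; the function name resolve_topology is hypothetical and is not part of the distributed program.

```python
import os


def resolve_topology(env=None):
    """Return the topology path stored in TROLLTOP, or raise if it is unusable.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    path = env.get("TROLLTOP")
    if not path:
        raise RuntimeError("TROLLTOP is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"TROLLTOP points to a missing file: {path}")
    return path
```

Calling resolve_topology() with no arguments checks the current shell environment.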

Command File (COMFILE): The program is run using a Command File (COMFILE), which describes all the input parameters and input files. The COMFILE is a plain text file listing all the arguments, one per line. The arguments needed in the COMFILE are listed below with short explanations; running the program with the option -help (>intf_model.exe -help) will also list the options with a short description of each. Comment lines are marked with a leading '//'. For several of the arguments, extra details on file formats are provided below.

COMFILE arguments/syntax:

// Comment Line
-i PDB.file             -Input template protein-DNA complex, see longer explanation below
-o OUTPUT.pdb           -Output filename for modeled structure
-res RESLIST            -File describing which residues to model and their identities
-lib  SC_LIB            -Protein sidechain rotamer library
-prot_lib_type TOR      -Type of rotamer library being used for protein sidechains.
                         Options: 'TOR' (torsional) or 'XYZ' (Cartesian).
-pol  POL_HYD           -Description of polar hydrogens
-rohs_eps  2.0          -Near-field dielectric permittivity value
                         (Sigmoidal function, see description in paper)
-hbond -2.0 	         -Maximum value of an optimal hydrogen bond
-scp 0.9                -VDW softening parameter (see description in paper)
-cons                   -Takes sidechain bond lengths and bond angles
                         from the input PDB file when residue identity is
                         not changed. When this option is not used, standard
                         values from CHARMM22 are used
-init 20                -Number of initial configurations to try
-cycles 10              -Number of cycles to run per initial configuration 
                         (see description in paper).
-DNA_XYZ_Lib_RotNum 50  -Number of Nucleotide rotamers to construct for 
                         each nucleotide being modeled (see paper for description).  
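
The argument list above can be assembled into a COMFILE programmatically. The sketch below is only an illustration: the input and output file names are hypothetical placeholders, and it assumes that the explanatory text in the listing above is documentation rather than part of the file, so each line holds just the flag and its value.

```python
def build_comfile(options, comment=None):
    """Return COMFILE text with one argument per line.

    `options` is a list of (flag, value) pairs; a value of None marks a
    bare switch such as -cons. An optional comment is written first,
    using the '//' comment prefix.
    """
    lines = []
    if comment:
        lines.append(f"// {comment}")
    for flag, value in options:
        lines.append(f"-{flag}" if value is None else f"-{flag} {value}")
    return "\n".join(lines) + "\n"


# File names below are hypothetical; numeric values are those shown above.
example_options = [
    ("i", "complex_converted.pdb"),
    ("o", "model.pdb"),
    ("res", "reslist.txt"),
    ("lib", "sc_rotamers.lib"),
    ("prot_lib_type", "TOR"),
    ("pol", "pol"),
    ("rohs_eps", 2.0),
    ("hbond", -2.0),
    ("scp", 0.9),
    ("cons", None),
    ("init", 20),
    ("cycles", 10),
    ("DNA_XYZ_Lib_RotNum", 50),
]

comfile_text = build_comfile(example_options, comment="Example run")
print(comfile_text)
```

Writing comfile_text to a file gives a COMFILE that can be edited by hand afterwards.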


Additional parameter/file information:

-i PDB.file: This PDB file needs to be formatted to agree with the syntax in the topology
                    file. A Perl script is included here to format a standard PDB file to agree
                    with the two AMBER98 topology files provided. The script can be run as shown
                    below. Parameters for metal ions are currently not included; PDB atom
                    lines for metal ions, such as the Zn atoms in zinc-finger proteins, therefore
                    need to be removed manually before running the script.

> perl pdb_to_Amber.pl -i FOO.pdb > FOO_converted.pdb
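
Removing metal ion lines by hand can be scripted before the conversion step. The sketch below is a hypothetical helper, not part of the distribution. It assumes standard fixed-column PDB records, where the residue name occupies columns 18-20, and that the metal ions are stored under the residue name ZN; other residue names can be added to the tuple.

```python
def strip_metal_ions(pdb_lines, metal_resnames=("ZN",)):
    """Return the PDB lines with metal-ion coordinate records removed.

    Only ATOM/HETATM records are examined; all other records are kept.
    The residue name is read from columns 18-20 (0-based slice 17:20).
    """
    kept = []
    for line in pdb_lines:
        is_coordinate = line.startswith(("ATOM", "HETATM"))
        resname = line[17:20].strip()
        if is_coordinate and resname in metal_resnames:
            continue  # skip this metal-ion record
        kept.append(line)
    return kept


# Usage (file names are placeholders):
# with open("FOO.pdb") as src:
#     cleaned = strip_metal_ions(src.readlines())
# with open("FOO_nometal.pdb", "w") as dst:
#     dst.writelines(cleaned)
```

The cleaned file can then be passed to the conversion script in place of the original PDB file.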


-res RESLIST: The residue list (RESLIST) file describes which sidechains and nucleotides
                    will be re-modeled. The syntax of the file is as follows:

LINE 1: subset description of residues to model
LINES 2-N: identity of the residues indicated in LINE 1
LINES N+1-M: constraint lines that restrict rotamer sampling

Example RESLIST file:

(chain A and range 10-12) or (chain B and range 1-3) or (chain C and range 5-7)
ASP A  10
LEU A  11
TRP A  12
CYS A  13
GUA B 1  CYT C 7
THY B 2  ADE C 6
THY B 3  ADE C 5
CON 1.0 :chain B or chain C
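
The three parts of a RESLIST file can be separated with a few lines of Python. This is a sketch under assumptions, not the program's own parser: it assumes each identity entry is a whitespace-separated triple (residue name, chain, residue number), that a nucleotide line holds two such triples, and that constraint lines start with CON followed by a cutoff and a ':'-prefixed subset. The shortened file used here is a hypothetical example.

```python
def parse_reslist(text):
    """Split RESLIST text into (selection, identities, constraints).

    identities: one list per line, each holding (name, chain, number)
                tuples; nucleotide lines yield two tuples (a base pair).
    constraints: list of (rmsd_cutoff, subset_description) tuples.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    selection = lines[0]
    identities, constraints = [], []
    for ln in lines[1:]:
        fields = ln.split()
        if fields[0] == "CON":
            cutoff, _, subset = ln[len("CON"):].partition(":")
            constraints.append((float(cutoff), subset.strip()))
        else:
            triples = [fields[i:i + 3] for i in range(0, len(fields), 3)]
            identities.append(
                [(name, chain, int(num)) for name, chain, num in triples]
            )
    return selection, identities, constraints


# Shortened, hypothetical RESLIST content for illustration.
example = """(chain A and range 10) or (chain B and range 1) or (chain C and range 7)
ASP A  10
GUA B 1  CYT C 7
CON 1.0 :chain B or chain C
"""
selection, identities, constraints = parse_reslist(example)
print(selection)
print(identities)
print(constraints)
```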

Line 1 indicates that residues 10-12 from chain A, residues 1-3 from chain B and residues 5-7 from chain C should all be modeled. Residues do not need to be contiguous; to select residues 1 and 3 from chain A one would write: chain A and (range 1 or range 3). The following lines (2-8) give the residue identities: protein sidechains are written one per line, while nucleotides are paired up with their base-pairing partner nucleotides as indicated. Only residues selected in line 1 will be modeled; therefore CYS A 13 (line 5) will not be modeled. The constraint line (line 9) indicates that for chain B and chain C only rotamers with an RMSD <= 1.0 angstroms from the crystal structure PDB will be allowed. A constraint only makes sense for sidechains whose identity does not change and for nucleotide rotamers, for which the RMSD is calculated using the sugar heavy atoms and the N1 (pyrimidine bases) or N9 (purine bases) atom.
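
The effect of a constraint line can be illustrated with a small RMSD filter. This is a sketch only: it assumes the rotamer and crystal coordinates are already in the same reference frame (no superposition is performed) and that atoms are listed in the same order in both; the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each argument is a sequence of (x, y, z) tuples in angstroms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def filter_rotamers(rotamers, crystal, cutoff=1.0):
    """Keep only rotamers whose RMSD from the crystal coordinates is <= cutoff."""
    return [rot for rot in rotamers if rmsd(rot, crystal) <= cutoff]
```

With a cutoff of 1.0, as in the CON line of the example, any rotamer deviating by more than 1.0 angstroms from the crystal coordinates is discarded before sampling.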

-lib  SC_LIB:  A copy of the large torsional rotamer library derived from the Cartesian library of
               Xiang & Honig (2001) JMB 311:421 is provided. A smaller Cartesian (XYZ) version
               of the Xiang & Honig library is provided as well. The syntax of any rotamer library
               needs to follow that of these files, and the rotamer type (TOR or XYZ) needs to be
               indicated with the -prot_lib_type argument.
-pol POL_HYD: This file contains one line that indicates which hydrogen atoms should be
                 treated as rotatable. Rotatable hydrogens (e.g. CYS HG1) will be rotationally
                 sampled during the modeling (i.e. when selecting the lowest energy rotamer, for
                 each CYS rotamer, the CA-CB-SG-HG1 dihedral angle will be sampled at 15 degree
                 increments). The rotatable hydrogens are normally those of CYS, THR, SER and
                 TYR. Two files are included here: the file pol indicates that the CYS, THR, SER
                 and TYR hydrogens (with the correct atom names) should be treated as rotatable,
                 and nopol is a dummy file indicating that no hydrogens should be treated as rotatable.
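
The rotational sampling of a polar hydrogen described above can be sketched as a simple grid search over the dihedral angle. The energy function used here is a toy stand-in and an assumption of this example; the program's actual energy terms are described in the paper. The sketch also assumes that the sampling covers the full 0-360 degree range.

```python
import math


def sample_rotatable_hydrogen(energy_fn, step_deg=15):
    """Evaluate energy_fn on a dihedral grid and return (best_angle, best_energy).

    The grid runs from 0 up to (but not including) 360 degrees, which
    gives 24 trial angles for the default 15 degree step.
    """
    angles = range(0, 360, step_deg)
    best_angle = min(angles, key=energy_fn)
    return best_angle, energy_fn(best_angle)


def toy_energy(angle_deg):
    """Hypothetical one-well torsional energy with its minimum at 60 degrees."""
    return 1.0 - math.cos(math.radians(angle_deg - 60.0))


best_angle, best_energy = sample_rotatable_hydrogen(toy_energy)
print(best_angle, best_energy)
```

In the real calculation this search would be repeated for every rotamer of every residue that carries a rotatable hydrogen.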

Running the program:

>intf_model.exe -i comfile