The scripts and input files that accompany this demo can be found in the
demos/public directory of the Rosetta weekly releases.
KEYWORDS: LIGANDS UTILITIES
This tutorial assumes a Unix-style command line (or Cygwin on Windows).
In general the ligand atoms are marked by "HETATM", but check the ligand with PyMOL after you've grepped out the HETATM lines:
grep HETATM starting_inputs/cel5A_glucan.pdb > starting_inputs/cel5A_lig_noH.pdb
For this step you need to add hydrogens. We use Avogadro because it's open source, but you can choose other software to place hydrogens: http://avogadro.openmolecules.net (this example is with 1.0.3 on Mac)
Open the file in Avogadro, choose Build --> Add Hydrogens, then save the molecule in MDL SDfile format as cel5A_lig.mol
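HETATM records can also include waters, ions, or other cofactors, so it can help to tally what the grep step picked up before building the ligand. The following is a minimal sketch, not part of the demo scripts: the function name and the two example records (residue names BGC and HOH, chain A) are hypothetical, and the column positions follow the fixed-width PDB format.

```python
from collections import Counter

def hetatm_residue_counts(pdb_lines):
    """Count HETATM atoms per (residue name, chain, residue number)."""
    counts = Counter()
    for line in pdb_lines:
        if line.startswith("HETATM"):
            res_name = line[17:20].strip()  # PDB columns 18-20
            chain = line[21]                # PDB column 22
            res_seq = line[22:26].strip()   # PDB columns 23-26
            counts[(res_name, chain, res_seq)] += 1
    return counts

# Hypothetical records (truncated after the residue number):
# one ligand atom and one water oxygen.
example = [
    "HETATM 2400  C1  BGC A 401",
    "HETATM 2401  O   HOH A 501",
]

for key, n_atoms in sorted(hetatm_residue_counts(example).items()):
    print(key, n_atoms)
```

If the tally shows residues other than the intended ligand, filter them out of cel5A_lig_noH.pdb before adding hydrogens.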
Alternatively, you can save the molecule in PDB format and convert the PDB file into a mol file using babel (http://openbabel.org/) as follows:
babel -ipdb starting_inputs/cel5A_lig.pdb -omol > starting_inputs/cel5A_lig.mol
A third alternative is to open the PDB file with PyMOL and save the molecule with a .mol extension, which forces PyMOL to write mol format.
The next step is to create a params file for Rosetta. The params file contains the internal coordinates of the atoms, the connectivity, the charge of each atom, and the Rosetta atom types. Most importantly for this demo, it contains PROTON_CHI lines, which specify the hydrogens whose chi angles Rosetta will sample. -n specifies the name of the ligand in Rosetta:
python src/python/apps/public/molfile_to_params.py starting_inputs/cel5A_lig.mol -n cel
The output params file is called cel.params. The code will sample proton chis at the explicit values stated in the params file, and then perform a minimization on those chis. You shouldn't have to change the params file, but if you want more sampling you can add more angles explicitly. A sample proton chi is:
CHI 1 C1 C2 O1 H8
PROTON_CHI 1 SAMPLES 3 60 -60 180 EXTRA 0
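When adding angles by hand, the count after SAMPLES has to match the number of angles that follow it. The sketch below reads a PROTON_CHI record and writes it back with extra sample angles. It is an illustration only: both function names are hypothetical, and it assumes the layout shown in the sample above (chi number, SAMPLES with a count and angles, EXTRA with a count and values).

```python
def parse_proton_chi(line):
    """Return (chi_index, samples, extra) from a PROTON_CHI record."""
    tokens = line.split()
    start = tokens.index("PROTON_CHI")
    chi_index = int(tokens[start + 1])
    s = tokens.index("SAMPLES", start)
    n_samples = int(tokens[s + 1])
    samples = [float(t) for t in tokens[s + 2 : s + 2 + n_samples]]
    e = tokens.index("EXTRA", s)
    n_extra = int(tokens[e + 1])
    extra = [float(t) for t in tokens[e + 2 : e + 2 + n_extra]]
    return chi_index, samples, extra

def format_proton_chi(chi_index, samples, extra=()):
    """Write a PROTON_CHI record with counts that match the angle lists."""
    parts = ["PROTON_CHI", str(chi_index), "SAMPLES", str(len(samples))]
    parts += [f"{a:g}" for a in samples]
    parts += ["EXTRA", str(len(extra))]
    parts += [f"{a:g}" for a in extra]
    return " ".join(parts)

chi, samples, extra = parse_proton_chi("PROTON_CHI 1 SAMPLES 3 60 -60 180 EXTRA 0")
# Add three more sample angles (illustrative values) and rewrite the record.
print(format_proton_chi(chi, samples + [0.0, 120.0, -120.0], extra))
```

Regenerating the record this way keeps the SAMPLES count in sync with the angle list, which is easy to get wrong when editing cel.params by hand.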
Pull out the protein without the ligand:
grep ATOM starting_inputs/cel5A_glucan.pdb > rosetta_inputs/cel5A_input.pdb
Add back the ligand PDB written by molfile_to_params:
cat cel_0001.pdb >> rosetta_inputs/cel5A_input.pdb
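The grep and cat steps can also be sketched in Python. This is a rough equivalent rather than part of the demo: the function name and the example lines are hypothetical. One difference from grep is that it keeps only lines whose record name is ATOM, whereas `grep ATOM` also keeps any other line that happens to contain the string "ATOM".

```python
def build_complex(protein_pdb_lines, ligand_pdb_lines):
    """Keep protein ATOM records, then append the ligand lines unchanged."""
    protein = [ln for ln in protein_pdb_lines if ln.startswith("ATOM  ")]
    return protein + list(ligand_pdb_lines)

# Hypothetical, truncated input lines for illustration.
protein_lines = [
    "REMARK   2 EVERY ATOM WAS REFINED",
    "ATOM      1  N   MET A   1",
    "HETATM 2400  C1  BGC A 401",
]
ligand_lines = ["HETATM    1  C1  cel X   1"]

for line in build_complex(protein_lines, ligand_lines):
    print(line)
```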
Now we can run the enzdes app in rosetta with minimal flags. This optimizes proton chis on the ligand while also repacking sidechains:
path/to/EnzdesFixBB.[platform][compiler][mode] -s rosetta_inputs/cel5A_input.pdb -extra_res_fa cel.params -database path/to/minirosetta_database/ -out:file:o cel5A_score.out -nstruct 1 -detect_design_interface -cut1 0.0 -cut2 0.0 -cut3 10.0 -cut4 12.0 -minimize_ligand true
For example, using the provided inputs you can run: (where
$> $ROSETTA3/bin/EnzdesFixBB.default.linuxgccrelease -s rosetta_inputs/cel5A_input.pdb -extra_res_fa rosetta_inputs/cel.params -out:file:o cel5A_score1.out -nstruct 1 -detect_design_interface -cut1 0.0 -cut2 0.0 -cut3 10.0 -cut4 12.0 -minimize_ligand true
Or, to optimize proton chis on the ligand without repacking sidechains:
$> $ROSETTA3/bin/EnzdesFixBB.default.linuxgccrelease -s rosetta_inputs/cel5A_input.pdb -extra_res_fa rosetta_inputs/cel.params -out:file:o cel5A_score2.out -nstruct 1 -detect_design_interface -cut1 0.0 -cut2 0.0 -cut3 0 -cut4 0 -minimize_ligand true
Both runs should produce the PDB file cel5A_input__DE_1.pdb, plus the score file cel5A_score1.out (repacking run) or cel5A_score2.out (no-repack run). The outputs can be placed into the output directory under different names, e.g. cel5A_output_nopack.pdb or cel5A_output_w_repack.pdb.
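The two runs differ only in the score file name and in -cut3/-cut4. The sketch below builds both argument lists from one template so the difference is explicit. It only assembles and prints the commands and does not run Rosetta. The run labels "w_repack" and "nopack" and the function names are hypothetical, chosen to match the output names suggested in this demo.

```python
# Per-run settings, copied from the two commands in this demo.
RUNS = {
    "w_repack": {"score": "cel5A_score1.out", "cut3": "10.0", "cut4": "12.0"},
    "nopack": {"score": "cel5A_score2.out", "cut3": "0", "cut4": "0"},
}

def enzdes_command(run_name,
                   executable="$ROSETTA3/bin/EnzdesFixBB.default.linuxgccrelease"):
    """Return the argument list for one of the two runs."""
    run = RUNS[run_name]
    return [
        executable,
        "-s", "rosetta_inputs/cel5A_input.pdb",
        "-extra_res_fa", "rosetta_inputs/cel.params",
        "-out:file:o", run["score"],
        "-nstruct", "1",
        "-detect_design_interface",
        "-cut1", "0.0", "-cut2", "0.0",
        "-cut3", run["cut3"], "-cut4", run["cut4"],
        "-minimize_ligand", "true",
    ]

def differing_args(a, b):
    """List the positions where two equal-length argument lists differ."""
    return [(x, y) for x, y in zip(a, b) if x != y]

for name in RUNS:
    print(" ".join(enzdes_command(name)))
print(differing_args(enzdes_command("w_repack"), enzdes_command("nopack")))
```

Since both runs write a PDB with the same name, renaming or moving the first output before starting the second run keeps the two results apart.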
-extra_res_fa specifies the params file for new residue types (the glucan in this case).
-out:file:o specifies the file name for the enzdes-style score output. This contains extra information about the output design, such as packing, interface energy, and many more.
-nstruct 1 specifies one run and one output PDB; the packing is stochastic, so for more sampling use a higher nstruct. 10-100 is recommended for most ligands.
-detect_design_interface tells Rosetta to set up the designable and packable residues (in a packer task) based on distance from the ligand. Distances are calculated from every ligand heavy atom to the CA of amino acids. [default values in brackets]
-cut1: CA less than cut1 is designed
-cut2: CA between cut1 and cut2, with CA --> CB vector pointing towards ligand, is designed
-cut3: CA less than cut3 is re-packed
-cut4: CA between cut3 and cut4, with CA --> CB vector pointing towards ligand, is re-packed
-minimize_ligand true allows ligand torsions to be minimized
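The four cutoffs can be read as a small decision rule. The sketch below is one reading of the descriptions above, not Rosetta's actual implementation. The function name is hypothetical, the distance and the "CB points towards the ligand" test are taken as precomputed inputs, and treating each cutoff as a strict "less than" is an assumption.

```python
def classify_residue(ca_distance, cb_points_to_ligand, cut1, cut2, cut3, cut4):
    """Return 'design', 'repack', or 'fixed' for one residue."""
    if ca_distance < cut1:
        return "design"
    if ca_distance < cut2 and cb_points_to_ligand:
        return "design"
    if ca_distance < cut3:
        return "repack"
    if ca_distance < cut4 and cb_points_to_ligand:
        return "repack"
    return "fixed"

# Cutoffs from the repacking run in this demo: nothing is designed.
repack_run = dict(cut1=0.0, cut2=0.0, cut3=10.0, cut4=12.0)
print(classify_residue(8.0, False, **repack_run))
print(classify_residue(11.0, True, **repack_run))
print(classify_residue(11.0, False, **repack_run))

# Cutoffs from the no-repack run: every residue stays fixed.
print(classify_residue(2.0, True, cut1=0.0, cut2=0.0, cut3=0.0, cut4=0.0))
```

With -cut1 and -cut2 at 0.0 no residue can fall in the design shells, which is why both runs in this demo only repack or leave sidechains untouched.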
If desired, use a more complete energy function and more sampling, as in this example flags file. Recommended full flags for a more careful run:
This setup can also be used for full enzyme design. This run is very close to a full design of the active site. For a full design just change cut1 and cut2, e.g.
-cut1 6 -cut2 8