# Enzyme design with variable CST blocks possible?

I am wondering if it is possible to use the enzyme design algorithm with a cst file containing variable cst blocks used for matching. I am using the following cst file for my matching:

    #block 1 - Malimide-Cys
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: C1 C2 H1
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_name: SG CB CA
      TEMPLATE:: ATOM_MAP: 2 residue1: C
      CONSTRAINT:: distanceAB: 1.85 0.20 240.00 1 1
      CONSTRAINT:: angle_A: 105.00 10.00 58.00 360.00 1
      CONSTRAINT:: angle_B: 95.00 15.00 34.00 360.00 2
      CONSTRAINT:: torsion_A: 120.00 8.00 0.50 360.00 1
      CONSTRAINT:: torsion_B: 180.00 10.00 0.30 120.00 1
      CONSTRAINT:: torsion_AB: 180.00 10.00 0.30 120.00 2
    CST::END

    #block 2 - CN1 H-bond
    VARIABLE_CST::BEGIN
    #sub-block - amide sidechain donor
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: N3 FE1 S2
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_type: NH2O
      TEMPLATE:: ATOM_MAP: 2 residue1: NQ
      CONSTRAINT:: distanceAB: 3.10 0.20 60.00 1 1
      CONSTRAINT:: angle_A: 170.00 10.00 15.00 360.00 2
      CONSTRAINT:: angle_B: 120.00 10.00 40.00 360.00 2
      CONSTRAINT:: torsion_A: 210.00 166.15 0.00 360.00 6
      CONSTRAINT:: torsion_B: 0.00 10.00 40.00 180.00 1
      CONSTRAINT:: torsion_AB: 140.00 166.15 0.00 360.00 6
    CST::END
    #sub-block - backbone amide donor
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: N3 FE1 S2
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_name: N CA H
      TEMPLATE:: ATOM_MAP: 2 is_backbone
      CONSTRAINT:: distanceAB: 3.10 0.20 60.00 1 2
      CONSTRAINT:: angle_A: 160.00 20.00 15.00 360.00 3
      CONSTRAINT:: angle_B: 120.00 10.00 40.00 360.00 2
      CONSTRAINT:: torsion_A: 150.00 166.15 0.00 360.00 6
      CONSTRAINT:: torsion_B: 0.00 10.00 40.00 360.00 1
      CONSTRAINT:: torsion_AB: 140.00 166.15 0.00 360.00 6
    CST::END
    VARIABLE_CST::END

    #block 3 - CN2 H-bond
    VARIABLE_CST::BEGIN
    #sub-block - amide sidechain donor
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: N4 FE2 S1
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_type: NH2O
      TEMPLATE:: ATOM_MAP: 2 residue1: NQ
      CONSTRAINT:: distanceAB: 3.10 0.20 60.00 1 1
      CONSTRAINT:: angle_A: 170.00 10.00 15.00 360.00 2
      CONSTRAINT:: angle_B: 120.00 10.00 40.00 360.00 2
      CONSTRAINT:: torsion_A: 210.00 166.15 0.00 360.00 6
      CONSTRAINT:: torsion_B: 0.00 10.00 40.00 180.00 1
      CONSTRAINT:: torsion_AB: 140.00 166.15 0.00 360.00 6
    CST::END
    #sub-block - backbone amide donor
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: N4 FE2 S1
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_name: N CA H
      TEMPLATE:: ATOM_MAP: 2 is_backbone
      CONSTRAINT:: distanceAB: 3.10 0.20 60.00 1 2
      CONSTRAINT:: angle_A: 160.00 20.00 15.00 360.00 3
      CONSTRAINT:: angle_B: 120.00 10.00 40.00 360.00 2
      CONSTRAINT:: torsion_A: 150.00 166.15 0.00 360.00 6
      CONSTRAINT:: torsion_B: 0.00 10.00 40.00 360.00 1
      CONSTRAINT:: torsion_AB: 140.00 166.15 0.00 360.00 6
    CST::END
    VARIABLE_CST::END

    #block 4 - CN1-Lys/Arg h-bond
    VARIABLE_CST::BEGIN
    #sub-block - lysine N
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: N3 FE1 S2
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_name: NZ CE 1HZ
      TEMPLATE:: ATOM_MAP: 2 residue1: K
      CONSTRAINT:: distanceAB: 2.80 0.20 50.00 1 1
      CONSTRAINT:: angle_A: 110.00 15.00 10.00 360.00 2
      CONSTRAINT:: angle_B: 109.50 10.00 40.00 360.00 2
      CONSTRAINT:: torsion_A: 210.00 166.15 0.00 360.00 6
      CONSTRAINT:: torsion_B: 0.00 10.00 40.00 360.00 1
      CONSTRAINT:: torsion_AB: 140.00 166.15 0.00 360.00 6
    CST::END
    #sub-block - arginine Ne
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: N3 FE1 S2
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_name: NE CD HE
      TEMPLATE:: ATOM_MAP: 2 residue1: R
      CONSTRAINT:: distanceAB: 2.80 0.20 50.00 1 1
      CONSTRAINT:: angle_A: 110.00 10.00 10.00 360.00 2
      CONSTRAINT:: angle_B: 120.00 10.00 40.00 360.00 2
      CONSTRAINT:: torsion_A: 210.00 166.15 0.00 360.00 6
      CONSTRAINT:: torsion_B: 0.00 10.00 40.00 180.00 1
      CONSTRAINT:: torsion_AB: 140.00 166.15 0.00 360.00 6
    CST::END
    #sub-block - arginine Nh
    CST::BEGIN
      TEMPLATE:: ATOM_MAP: 1 atom_name: N3 FE1 S2
      TEMPLATE:: ATOM_MAP: 1 residue3: HAS
      TEMPLATE:: ATOM_MAP: 2 atom_name: NH1 CZ 1HH1
      TEMPLATE:: ATOM_MAP: 2 residue1: R
      CONSTRAINT:: distanceAB: 2.80 0.20 50.00 1 1
      CONSTRAINT:: angle_A: 110.00 10.00 10.00 360.00 2
      CONSTRAINT:: angle_B: 120.00 10.00 40.00 360.00 2
      CONSTRAINT:: torsion_A: 210.00 166.15 0.00 360.00 6
      CONSTRAINT:: torsion_B: 0.00 10.00 40.00 180.00 1
      CONSTRAINT:: torsion_AB: 140.00 166.15 0.00 360.00 6
    CST::END
    VARIABLE_CST::END

But when I use the constraint file for enzyme design, I get an error:

    ...
    protocols.toolbox.match_enzdes_util.EnzConstraintIO: read enzyme constraints from ../HAS_C_NQbb_NQbb_KR_1.cst ... done, 4 cst blocks were read.
    protocols.toolbox.match_enzdes_util.EnzConstraintIO: Generating constraints for pose...
    protocols.toolbox.match_enzdes_util.EnzConstraintParameters: for block 1, 6 newly generated constraints were added
    protocols.toolbox.match_enzdes_util.EnzConstraintIO: checking cst data consistency for block 1... done
    protocols.toolbox.match_enzdes_util.EnzConstraintIO: Cst Block 1done...
    protocols.toolbox.match_enzdes_util.EnzConstraintParameters: for block 3, 4 newly generated constraints were added
    protocols.toolbox.match_enzdes_util.EnzConstraintIO: checking cst data consistency for block 2... done
    protocols.toolbox.match_enzdes_util.EnzConstraintIO: Cst Block 2done...
    param_cache_.size:4 cst_block:5
    cst_block should be smaller or equal to param_cache_.size()

    ERROR: cst_block <= param_cache_.size()
    ERROR:: Exit from: src/protocols/toolbox/match_enzdes_util/EnzdesCstCache.cc line: 83

It runs fine when I only include the 4 blocks in the cst that are relevant to the particular site I'm designing, but it seems unreasonable to have to do this for every single match result. Is there any way to run the enzyme design program with the variable cst blocks I used in the match program?

Thanks,
-Igor

Thu, 2011-10-06 11:24
petrikigor

hey igor,

the VARIABLE_CST thing is an unofficial feature at the moment that only the matcher supports, so even if it does sound unreasonable to make a cst file for every particular type of site for the enzyme_design app, that's unfortunately what you have to do at the moment. looks like in your case that will be 2*2*3=12 cstfiles.
i'll try to code that feature into the enzyme design app at some point in the future, and can let you know once that happens.
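
Until enzyme design reads VARIABLE_CST blocks, the per-site cst files can be generated by a script instead of by hand. The following is a minimal Python sketch and not part of Rosetta: the function names `split_cst_blocks` and `expand_variable_csts` are made up for illustration, and the parser assumes that each `CST::BEGIN`, `CST::END`, `VARIABLE_CST::BEGIN` and `VARIABLE_CST::END` keyword sits on its own line. Comment lines outside a CST block are dropped. The toy input only mimics the layout of the file above (1, 2, 2 and 3 alternatives) and contains no real constraint records.

```python
import itertools


def split_cst_blocks(text):
    """Return one list per cst block; each list holds the alternative
    CST::BEGIN ... CST::END sub-blocks (as strings) for that block."""
    blocks = []
    in_variable = False
    alternatives = []
    current = None  # lines of the CST block being read, or None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "VARIABLE_CST::BEGIN":
            in_variable, alternatives = True, []
        elif stripped == "VARIABLE_CST::END":
            blocks.append(alternatives)
            in_variable = False
        elif stripped == "CST::BEGIN":
            current = [stripped]
        elif stripped == "CST::END":
            current.append(stripped)
            sub_block = "\n".join(current)
            if in_variable:
                alternatives.append(sub_block)
            else:
                blocks.append([sub_block])
            current = None
        elif current is not None and stripped:
            current.append("  " + stripped)
        # anything else (blank lines, comments between blocks) is skipped
    return blocks


def expand_variable_csts(text):
    """Return one fixed cst file (as a string) per combination of alternatives."""
    blocks = split_cst_blocks(text)
    return ["\n\n".join(choice) + "\n" for choice in itertools.product(*blocks)]


def toy_block(labels):
    """Build a placeholder block; more than one label makes it a variable block."""
    subs = [f"CST::BEGIN\n# {label}\nCST::END" for label in labels]
    if len(subs) == 1:
        return subs[0]
    return "VARIABLE_CST::BEGIN\n" + "\n".join(subs) + "\nVARIABLE_CST::END"


demo = "\n".join(
    toy_block(labels)
    for labels in (
        ["Cys"],
        ["CN1 amide sidechain", "CN1 backbone amide"],
        ["CN2 amide sidechain", "CN2 backbone amide"],
        ["Lys NZ", "Arg NE", "Arg NH"],
    )
)
files = expand_variable_csts(demo)
print(len(files))  # 1 * 2 * 2 * 3 = 12
```

Each string in `files` can then be written to its own cst file and used for the matches that were found with that combination of sub-blocks.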

one question: how did you become aware of this feature? i don't think I wrote documentation about it.

1) you declared everything to be covalent, but from the geometry specified it seems that only the first interaction, the Cys, is actually a covalent bond. so i'd switch the other blocks to non-covalent.

2) for the 4th block, you can specify the arginine NE and NH in the same block by saying

    TEMPLATE:: ATOM_MAP: 2 atom_name: NE CD HE
    TEMPLATE:: ATOM_MAP: 2 atom_name: NH1 CZ 1HH1
    TEMPLATE:: ATOM_MAP: 2 residue1: R

that way you only have to generate 2*2*2=8 cst files.

3) how's the match/enzyme design code working for you in general?

cheers,
florian

Thu, 2011-10-06 14:36
frichter

Hi Florian,

Thanks for the quick and thorough response! First let me answer question (1), yes, the rest should be non-covalent; that's what I get for copy/pasting. :P

Let me also thank you for taking the time to write thorough user manual entries and tutorials about these programs. They were very helpful. In general, the programs are working quite well when I set up my files properly. I had some instances at first where I allowed too much flexibility and ran out of RAM in the matcher. :P But in general it is going as smoothly as a learning experience can.

As for the VARIABLE_CST blocks, they were used in one of the integration tests in Rosetta 3.2.

Thanks for the tip on the multiple "atom_name" definitions. I do have a question about that though - does that work in enzyme design as well, and if so, how does the program know which of the two atom sets to use for the constraint?

Thanks,
-Igor

EDIT: P.S. I am also wondering how something like this can be done:

I want to define a constraint like this:
    TEMPLATE:: ATOM_MAP: 1 atom_name: N3 NE2 CD
    TEMPLATE:: ATOM_MAP: 2 atom_name: NE CD HE

Where N3 is on the ligand and NE2 and CD are atoms on the sidechain of an upstream match. Is there any way to do something like this? Basically, to have a secondary constraint defined through a ligand atom?

Thanks.

Fri, 2011-10-07 16:26
petrikigor

hey igor,

multiple atom name definitions work in both matcher and enzdes. in enzdes, the code will always evaluate the constraint score for all possibilities, but only apply the lowest observed constraint penalty. i.e. whichever one of the possible atoms is best suited to satisfy the constraint will be enforced.
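
The "evaluate all possibilities, apply the lowest penalty" behaviour can be pictured with a toy calculation. This is only a sketch under stated assumptions, not the enzdes code: it assumes a flat-bottomed harmonic penalty (zero within the tolerance, force constant times the squared excess outside it), it scores the distance term only, and the function names and the two example distances are invented for illustration.

```python
def flat_harmonic(value, x0, tol, k):
    """Assumed penalty shape: zero within x0 +/- tol, k * excess**2 outside."""
    excess = max(0.0, abs(value - x0) - tol)
    return k * excess ** 2


def best_alternative(distances, x0, tol, k):
    """Score every alternative atom and keep the one with the lowest penalty."""
    penalties = {atom: flat_harmonic(d, x0, tol, k) for atom, d in distances.items()}
    best = min(penalties, key=penalties.get)
    return best, penalties[best]


# Hypothetical ligand-N3 distances (in Angstrom) to the two arginine nitrogens,
# scored against a distanceAB record of "2.80 0.20 50.00".
distances = {"NE": 3.40, "NH1": 2.90}
atom, penalty = best_alternative(distances, x0=2.80, tol=0.20, k=50.0)
print(atom, penalty)
```

Here NH1 lies within the tolerance and NE does not, so NH1 is the atom whose penalty would be applied.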

regarding your other question, sorry to disappoint you again, but that's not really possible. the reason is that if you do this, you're defining a constraint that includes three different residues (assuming that the NE2 and CD of ATOM_MAP: 1 records are on a different upstream residue than the NE and CD atoms of the ATOM_MAP: 2 records), and if you have a constraint that includes 3 different residues your energy function isn't rotamer pairwise decomposable anymore. which in turn means that rotamer packing/design calculations couldn't be carried out efficiently anymore.

also, have you played around with secondary matching for some of your less well defined constraints? my prediction is that you'll get more matches.

cheers,
florian

Fri, 2011-10-07 18:10
frichter

Hi Florian,

Thanks again. I didn't think the secondary match via a ligand atom would be possible, but I thought I'd ask, just in case. I appreciate the detailed explanation.

Also, thanks for reminding me about secondary constraints and that they are not just for defining constraints with other residues. Remembering this also solves, I think, a problem I was struggling with for another task: how to match a ligand composed of fewer than three atoms. I was concerned about the binning of orientations and how many torsional degrees of freedom I'd need to allow, because the position of the third dummy atom doesn't matter for me, but with secondary constraints I think that problem disappears!

Thanks,
- Igor

P.S. Now I have another question: is it possible to run the matcher and enzyme design to design a ligand binding site at a protein interface? I.e., I have two chains in a crystal structure and I want to bind a ligand at the interface. I don't need to reposition either of the protein monomers with respect to each other (which, as far as I understand, can't be done yet), just find a binding site and redesign the sidechains of both monomers as they are positioned.

EDIT: P.P.S. Does resname "HIS" in the cst file imply both HIS and HIS_D to the matcher/enzdes, or do I have to specify HIS_D separately? Or put another way, would this work to specify that both deprotonated Ne and Nd are allowed to satisfy this constraint:

    TEMPLATE:: ATOM_MAP: 2 atom_name: Nhis
    TEMPLATE:: ATOM_MAP: 2 residue3: HIS

Or do I have to do something different?

Sat, 2011-10-08 09:32
petrikigor

hey,
re matching a ligand with less than three atoms
yes, with secondary matching this problem would disappear. if you use classic matching, you should set the euler bin size of the matcher to 360 (option -match::euler_bin_size), and then it should be fine if you set the torsion_A, torsion_AB and angle_A parameters to one value only. this works with 1 atom ligands (i.e. metals); i'm not sure how it would work with two atom ligands.

re matching/enzdes at interface.
that's totally possible, as long as the ligand is the last residue in the pdb file (which is where the matcher will put it). for matching, you have to specify the match positions in absolute numbering though, i.e. if chain A and B both have 100 residues, and residue 40 of chain B should be considered by the matcher, this would be residue 140 in the match positions file.
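
The conversion to absolute numbering can be sketched in a few lines of Python. The helper name `absolute_position` is hypothetical, and the sketch assumes each chain is numbered 1..N in the PDB file with no gaps or insertion codes; for a file with gaps, count the residues actually present instead.

```python
def absolute_position(chain_lengths, chain, resnum):
    """Convert a chain-relative residue number into an absolute position.

    chain_lengths: (chain id, number of residues) pairs in PDB file order.
    """
    offset = 0
    for name, length in chain_lengths:
        if name == chain:
            if not 1 <= resnum <= length:
                raise ValueError(f"residue {resnum} is not in chain {chain}")
            return offset + resnum
        offset += length
    raise KeyError(f"unknown chain {chain}")


# The example from the post: chains A and B with 100 residues each.
chains = [("A", 100), ("B", 100)]
print(absolute_position(chains, "B", 40))  # 140
```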

re His in cstfile
HIS specifies both HIS and HIS_D, it's actually impossible at the moment to only specify one of them
Nhis specifies NE2 in HIS_D and ND1 in HIS, however the constraint is applied to the atom that is of type Nhis at the time the constraint is set up. i.e. sometimes the wrong protonation state could be set in the packer. at some point i'll get around to fixing that bug.

cheers,
florian

Tue, 2011-10-11 13:43
frichter

Florian,

Thanks again for the helpful responses.

In regards to the HIS issue, one of my colleagues has actually been having a problem. The "ligand" that he is trying to position is constrained with interactions to various HIS residues. The matcher seems to give a reasonable result, with some HIS using Nd and some Ne. But when that output is used as input to the enzyme design application, some of the nitrogens that are supposed to be coordinating the "ligand" become protonated. It'd be really nice if there were a way to specify a different residue for a different protonation state of HIS.

In any case, thanks for your help.

-Igor

Sun, 2011-10-16 07:48
petrikigor