I am docking two proteins, say protein A and protein B. Protein B has a modified amino acid composed of serine, phosphopantetheine, and a long acyl part. Before docking, I generated a parameter file (.params) for the modified amino acid with molfile_to_params.py and the -a flag. I also generated conformers for this modified amino acid (as if it were a ligand). Now I am wondering whether I should do protein-protein or protein-ligand docking. Does it make sense to do protein-protein docking first and then protein-ligand docking?
Note that I have only run protein-protein docking so far, and the output structures barely differ from each other (standard deviation close to zero) or even from the input structure.
Any suggestion is appreciated.
So you do not like your output and you are modelling a pre-activation complex of an ACP and a Fab enzyme, yes?
I would say ligand-protein first as using ACPs from different species seems to not have a deleterious effect on catalysis IIRC, so the protein-protein interaction is probably very weak.
The transition state would have the phosphopantetheine cysteine and the enzyme's cysteine forming a tetrahedral hemithioacetal thingy. This is totally modellable, as amino acids can have a third connection, although it needs to be declared in the PDB block with a LINK record. So you'd want your pre-activation complex to be as close to that as possible.
For the pre-activation complex I would dock the transferred acyl (or the physiological acyl substrate if you are trying to engineer specificity towards something else) to see roughly where the atoms ought to go. I've docked an acyl-CoA into one of the anabolic enzymes in a few steps (I don't remember which, fabB?) and it was very clear cut because the keto groups of the acyls seemed to have a clear binding mode for substrates.
And then do your protein-protein docking with constraints for the omega carbon and the sulfur. Sparks fly below 3 Å, but if you push them too far up that activation hill, odd stuff happens unfortunately, so I'd say a harmonic constraint with a very low sigma (say 0.1) and a mean of 3.5 Å works best. One thing to be aware of is the deprotonation of the catalytic cysteine (residue CYZ), which in turn requires a specific tautomer of histidine (if the base is a histidine).
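To get a feel for how tight a sigma of 0.1 is, here is a minimal Python sketch of a harmonic penalty. It assumes the functional form ((d - mean) / sigma)^2, which is my understanding of Rosetta's HARMONIC function; the function name and the sampled distances are purely illustrative and not part of any Rosetta API.

```python
def harmonic_penalty(distance, mean=3.5, sigma=0.1):
    """Quadratic penalty ((d - mean) / sigma)**2.

    Assumption: this mirrors the form of Rosetta's HARMONIC function.
    The defaults are the mean (3.5 A) and sigma (0.1) suggested above.
    """
    return ((distance - mean) / sigma) ** 2


# Illustrative distances in Angstrom: the penalty grows quickly away from the mean.
for d in (3.0, 3.3, 3.5, 3.7, 4.0):
    print(f"{d:.1f} A -> penalty {harmonic_penalty(d):.1f}")
```

With such a small sigma, being only 0.5 Å off the mean already costs about 25 units before any constraint weight is applied, which is why the pair is held close to 3.5 Å rather than being squeezed towards 3 Å.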
I really appreciate your response.
Your guess is completely right. I am modeling the complex of an ACP and fab enzyme to change the specificity of the enzyme. But my system is slightly different from yours: In my enzyme, D, N, H, and E compose the catalytic network.
Your answer "The transition state would have the phosphopantetheine cysteine and the enzyme's cysteine forming a tetrahedral hemithioacetal thingy, totally modellable as amino acids can have a third connect, albeit that it needs to be declared in the PDB block with the LINK record. So you'd want your pre-activation complex to be as close to that as possible." is not completely clear to me. In my ACP, serine is attached to the phosphopantetheine arm, not cysteine.
What I understood from your response is that I need to do protein-ligand docking first and then protein-protein docking with a harmonic constraint, but I did not follow the details you mentioned.
1. What do you mean by catalysis IIRC? Why does ACP have a deleterious effect on catalysis IIRC?
2. In protein-ligand docking, should ACP and the modified amino acid (serine plus phosphopantetheine plus acyl) have different chain id?
3. What do you mean by docking the transferred acyl or physiological acyl?
4. What is the rationale behind the constraint (the omega carbon and the sulfur —sparks fly below 3 Å)? In my case, since the acyl is long, the distance between the sulfur in the phosphopantetheine arm and the omega carbon is between 10-14 A (depending on the chain length). Therefore, 3A does not make sense. Instead of that constraint, does it make sense to have a distance constraint between the carbonyl oxygen of acyl and my catalytic residue of the enzyme to have an electron jump happen in both docking steps?
5. How can I make sure the specific tautomer of histidine is right in my complex?
I would be most grateful if you could help me with the detailed steps you have done in your work or refer me to that paper.
You are totally right, I made a silly mistake and my answer is completely invalid, hence the utter confusion.
Only the synthase (the malonyl transfer one that does the decarboxylative Claisen condensation) has a cysteine onto which one of the substrates gets transferred. The rest do not form a covalent transition state, i.e. the acyl does not get passed onto the enzyme and then returned, with the hand-over going through a tetrahedral transition state with two sulfurs on one carbon. For some reason I thought this was the default.
Not having a covalent transition state means that my worries about strain in the pre-reaction complex are not applicable. The transition state for the dehydrogenases and dehydratase is just ketone to hydroxylate ion, so not worth modelling unless you are worried that your active site may be affected by a mutation. To clarify my incorrect train of thought: were it covalent, two atoms would need to get close for "sparks to fly" (bond forming), but this proximity would result in a highly unfavourable, screaming Lennard-Jones (attraction-repulsion) term of the force field, which would have caused issues with the calculations. But this does not apply.
Consequently, the only constraints you need are between the ketone and the acid / conjugate acid and between the proton and the base / conjugate base. This, however, may have protonation issues: GLU is glutamate, while GLU:protonated (patch) is glutamic acid, and getting the wrong one will cause repulsions. I do not know how to declare patches outside PyRosetta, and I have been ultra lazy in the past and used an NCAA for protonated glutamic acid.
So you could simply do protein-protein docking with constraints without a preceding step. Do note, however, that constraints apply to full-atom and centroid modes differently and the centroid residue params (in the database) differ: most amino acids have only a weird beta-carbon as a sidechain. I mention that because protein-protein docking has a centroid step.
However, if you do protein-ligand docking first, you can constrain the position of more atoms, place the ACP better, and substantially reduce the freedom of movement in the protein-protein docking (more human time, less CPU time).
"Eats, leaves and shoots": sorry for my bad punctuation. I meant a constraint for the omega carbon and a separate one for the sulfur of the enzyme's cysteine, which is not applicable as mentioned.
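As an illustration of why forcing two non-bonded atoms close together makes that term "scream", here is a minimal sketch of a generic 12-6 Lennard-Jones energy. The well depth and optimal distance are made-up illustrative values, not Rosetta's parameters, and Rosetta's own attractive/repulsive terms are implemented differently from this textbook form.

```python
def lennard_jones(r, epsilon=0.1, r_min=3.8):
    """Generic 12-6 Lennard-Jones energy.

    epsilon: well depth, r_min: distance of the energy minimum (Angstrom).
    Both defaults are illustrative assumptions, not force-field values.
    """
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2 * ratio ** 6)


# The energy is mildly negative near r_min and rises steeply as atoms are pushed together.
for r in (4.5, 3.8, 3.0, 2.5, 2.0):
    print(f"r = {r:.1f} A -> E = {lennard_jones(r):.2f}")
```

The steep r^-12 wall is what a tight distance constraint would be fighting against if it tried to hold two non-bonded atoms at near-bonding distance.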
The omega carbon constraint may be good because from your question I am guessing the acyl was not going in the active site. So a rough constraint (WELL or BOUNDED_HARMONIC) would be good.
I meant using ACP and enzyme from different species may not have any effect on activity despite requiring a protein-protein interaction.
I do not remember the paper, but there is one where wet lab folk did experiments with E. coli ACP and probably something like B. subtilis ACP, and there was no difference in the activity of a target E. coli enzyme. I have the feeling it was in vivo only (damn wet lab folk!). This could mean either that ACPs from different bacterial phyla are ultra-conserved on their surface or that the ACP-enzyme interface is weak. I do not remember any papers that did DLS or similar to get a K_D for it. Caveat: I was reading about this five years ago, as I was worried that an enzyme from a different bacterial phylum might not work in E. coli. I changed jobs subsequently, so I don't know if it did! I did not do a sequence alignment of ACPs to see conservation either. But I was baffled by the fact that the scarce info about the location of the Fab pathway suggests it is not membrane bound (odd, right?), which left me really confused. The take-home message is that there is a chance the ACP-enzyme interface is very weak.
Docking partners are defined with a nifty A_B schema. You can have, say, AC_Z. But I would keep it simple and have everything belonging to the ACP in a single chain ID. After all, this is what PDB or CCP4 recommends for cofactors.
Histidine tautomerisation is weird, as it's basically treated like a different rotamer, so whereas you can force delta protonation by mutating to HIS_D, this does not guarantee it won't flip back. I actually asked a question about it myself a few months ago.
Hello and thank you very much for the help.
Just to clarify my case, Acyl exists in the active site and my enzyme is FatB.
Could I ask a few more questions?
I realized I need to do docking (either protein-protein docking or both protein-ligand docking and protein-protein docking) with two chains (one for enzyme and one for the whole Acyl-ACP) with constraint/constraints.
1. You are talking about two constraints (the ketone and the acid/conjugate acid, and the proton and the base/conjugate base). Can you clarify them? I think what you are describing is different from what I have in mind. I am thinking of one distance constraint between OD2 of ASP (the catalytic residue) and the carbonyl oxygen of the acyl part. Does this constraint make sense? If so, I want this distance to be at most 3 Å (<= 3 Å) so that an electron jump can happen. If I define it as below, it will hold the distance at exactly 3 Å. How can I modify it to get what I want (<= 3 Å)?
AtomPair OD2 281A O6 36B HARMONIC 3 0.1
2. Thank you for mentioning that protein-protein docking has a centroid step. As far as I know, the difference between full-atom and centroid constraints means I have to define -constraints:cst_file, -constraints:cst_fa_file, -constraints:cst_fa_weight, and -constraints:cst_weight. But I am not sure about the centroid residue params. The ABC.params file I generated for the modified residue (with the -a flag, because it is a residue in my protein) applies to full-atom mode (-extra_res_fa ABC.params). Do I need to add another option for parameters in centroid mode? Also, do I have to consider the centroid model in protein-ligand docking?
3. I could not find the wet lab paper you mentioned.
I really appreciate your help.
I am not familiar with that particular one; Google tells me it's a thioesterase. So I am assuming your ASP is a proton donor to the ketone, which gets attacked by a deprotonated water? If so, 'Asp' is aspartate (nucleophile), not aspartic acid (electrophile). This will score poorly and will be repulsed. So a protonated variant, aspartic acid, is required, and it needs to be explicitly declared. If you are using an app, there's a topology file for protonated variants in the database folder, something like / chemical / residue type sets / full-atom standard / residues / protonation. If I am wrong and the thioester is instead swapped with the aspartate, then it'd be aspartate. If it's something else, the mechanism should tell.
The constraint makes sense. However, I would say that a standard deviation of 0.2 or a bounded harmonic should be fine too: hydrogen-bond donor-acceptor distances are normally between 2.8 Å and 3.2 Å.
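On the question of getting "3 Å or less" rather than "exactly 3 Å", the idea behind a bounded or flat-bottomed function can be sketched in Python as below. This is only an illustration of the shapes, with hypothetical function names. I believe the corresponding constraint-file keywords are BOUNDED and FLAT_HARMONIC, but treat that as an assumption and check the constraint-file documentation for the exact names and argument order.

```python
def upper_bounded_penalty(distance, upper=3.0, sigma=0.2):
    """No penalty at or below `upper`; harmonic penalty above it."""
    if distance <= upper:
        return 0.0
    return ((distance - upper) / sigma) ** 2


def flat_harmonic_penalty(distance, mean=3.0, sigma=0.2, tolerance=0.2):
    """No penalty within mean +/- tolerance; harmonic penalty outside."""
    excess = abs(distance - mean) - tolerance
    return 0.0 if excess <= 0 else (excess / sigma) ** 2


# Compare both shapes over a few illustrative distances (Angstrom).
for d in (2.5, 2.9, 3.0, 3.4, 3.6):
    print(f"{d:.1f} A  one-sided: {upper_bounded_penalty(d):5.2f}  "
          f"flat-bottom: {flat_harmonic_penalty(d):5.2f}")
```

The one-sided form matches "anything up to 3 Å is fine", while the flat-bottomed form with a tolerance of 0.2 around 3.0 Å leaves the 2.8 to 3.2 Å hydrogen-bond range unpenalised.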
A full atom topology works as a centroid one.
For amino acids, the coarse-grain topology has fewer atoms because the side chain is represented as a single atom called CEN. However, the atom type for these is residue specific, so making a ligand-specific centroid params file is not really done. You could use a rebranded methionine centroid params file, but that would be rather short.
Unfortunately, I could not keep my notes from then, so I cannot check it. As said, it was an in vivo paper, where an exogenous enzyme was active and the authors did not worry about that enzyme having coevolved with a different ACP —I assume there should be many in vivo complementation studies like that— and I did not check if the surface was conserved —https://consurf.tau.ac.il/ is a quick way of checking.
Thank you so much for your explanation.
In my case, ASP acts as a nucleophile, so it is aspartate and I do not need to declare the protonated ASP (ASP_P1.params).
My purpose of defining the constraint (AtomPair OD2 281A C6 36B HARMONIC 3 0.2) is to model the nucleophilic attack of ASP on the thioester carbonyl carbon, not to model hydrogen bond. Is the above constraint still valid for my purpose? There is also a hydrogen bond between the catalytic ASP and catalytic HIS and ASN. I can add those hydrogen bond constraints to my modeling as well if you think it is necessary?
Just one more question: Do you recommend using "docking flexible proteins" with the above constraints in my case?
Thank you again
It's an interesting conversation, so no thankses needed.
Ah, I follow. So, like the cysteine in the synthase at the beginning of this thread, pushing the attacking atom close to model right when "sparks fly" can have some weird effects due to the strain of the restraint. So I'd say AtomPair OD2 281A C6 36B HARMONIC 3.5 0.2 at first and only during refinement of the docked complex you can push it.
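The two-stage idea (looser target while docking, tighter one only during refinement) could be scripted along these lines. This is a minimal sketch: the helper name is hypothetical, the atom and residue labels are the ones quoted in this thread, and the 3.0 Å refinement target is simply the value from the question, not a recommendation.

```python
def atom_pair_harmonic(atom1, res1, atom2, res2, mean, sigma):
    """Format one AtomPair HARMONIC line in the style used in this thread."""
    return f"AtomPair {atom1} {res1} {atom2} {res2} HARMONIC {mean} {sigma}"


# Stage 1: docking with the relaxed 3.5 A target.
docking_cst = atom_pair_harmonic("OD2", "281A", "C6", "36B", 3.5, 0.2)

# Stage 2: refinement of the docked complex with the tighter target (assumed 3.0 A).
refinement_cst = atom_pair_harmonic("OD2", "281A", "C6", "36B", 3.0, 0.2)

print(docking_cst)
print(refinement_cst)
```

Each string would go into its own constraint file, so that the docking run and the later refinement run can each be pointed at the appropriate one.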
Keeping the active site residues restrained is not at all a bad idea (after all, were they to move, one would have to repeat it). However, histidine and asparagine may be the wrong way round in crystal structures, so it would require some manual inspection to be certain that they are flipped correctly.
Protein protein docking makes sense, but I'll think aloud once more though.
So if it is a ping-pong reaction there would be four steps:
Step 5 is the easiest to model, followed by 3. If you are after making an exogenous ACP more native, hence the protein-protein docking, then 1 is the best model... However... if you are designing specificity for a different ligand, then step 3 would be the better starting choice, without worrying about the ACP (which without the acyl group may even bind weakly). However, this assumes that the mechanism is correct. Playing devil's advocate: why isn't your mechanism simply protonation of the thioester's oxygen (via a protonated aspartic acid residue) and deprotonation of water via a histidine whose other ring nitrogen is protonated and hydrogen bonding with another aspartate/glutamate, or maybe asparagine/glutamine? In which case the protein-protein docking is still better.
EDIT: the word flexible baffled me and I just realised a possible misunderstanding. I don't think you meant to refer to it (it did not even cross my mind), but just in case: flexible peptide docking is for a short peptide in a groove that is calculated from scratch, and it has nothing to do with this.
What I meant by docking flexible proteins is accounting for backbone movement by docking a conformational ensemble of the proteins, as explained in this link (https://www.rosettacommons.org/demos/latest/tutorials/Protein-Protein-Docking/Protein-Protein-Docking). Is it right to use this type of docking to model backbone movement? I was wondering whether I can model the conformational change in my enzyme by using "docking flexible proteins" and making an ensemble of conformers for my enzyme.
Regarding your questions about mechanism, it is a ping-pong mechanism, but since ACP interacts with the enzyme, I will work with the first step you mentioned.
I have a question about the manual inspection of hydrogen bonds you mentioned. The hydrogen bond is between OD1 of D281 and the backbone nitrogens of histidine and asparagine. What type of manual inspection do I need to do?
One more question: I have a Glu in my catalytic network which is protonated. Can I treat it as protonated simply by adding a distance constraint to the above His, without declaring the hydrogen, or do I need to explicitly use the protonated .params file and rename the protonated GLU to something else?
Perfect, I was just checking in case of a cataclysmic misunderstanding.
The OD1 of D281 to backbone H is fine. I meant checking that you have the H-donor···H-acceptor pairs correctly configured in your catalytic network. For asparagine and glutamine, that is ND2/NE2 = donor, OD1/OE1 = acceptor. For histidine tautomers, either ND1 (HIS_D tautomer, aka HID in Gromacs etc.) or NE2 (HIS tautomer, aka HIE) is the donor, and the other ring nitrogen is the acceptor. Most cases should be correct, but vigilance pays off.
Yes, protonated forms are **not** considered by default; you have to explicitly declare it as a protonated NCAA or patch it (PyRosetta/scripts only). Without it the histidine tautomer will be incorrect, as you will have an H-acceptor···H-acceptor scenario. Even with a trapped crystalline water in the way (H-acceptor···H-O-H···H-acceptor), they would need to be 105° apart for that to work, and they would be too close to be happy.
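That donor/acceptor bookkeeping can be written down as a small lookup table for checking each pair in the network by hand. This is a minimal sketch with hypothetical names. It covers only the side-chain atoms discussed here, uses standard PDB atom names, and assumes the unprotonated ASP and GLU forms.

```python
# Side-chain hydrogen-bond roles for the residue types discussed in this thread.
SIDECHAIN_HBOND_ROLES = {
    "ASN":   {"donors": {"ND2"}, "acceptors": {"OD1"}},
    "GLN":   {"donors": {"NE2"}, "acceptors": {"OE1"}},
    "HIS_D": {"donors": {"ND1"}, "acceptors": {"NE2"}},  # delta-protonated tautomer
    "HIS":   {"donors": {"NE2"}, "acceptors": {"ND1"}},  # epsilon-protonated tautomer
    "ASP":   {"donors": set(), "acceptors": {"OD1", "OD2"}},  # deprotonated form
    "GLU":   {"donors": set(), "acceptors": {"OE1", "OE2"}},  # deprotonated form
}


def is_donor_acceptor_pair(res_a, atom_a, res_b, atom_b):
    """True if one atom is a donor and the other an acceptor."""
    roles_a, roles_b = SIDECHAIN_HBOND_ROLES[res_a], SIDECHAIN_HBOND_ROLES[res_b]
    a_donates = atom_a in roles_a["donors"] and atom_b in roles_b["acceptors"]
    b_donates = atom_b in roles_b["donors"] and atom_a in roles_a["acceptors"]
    return a_donates or b_donates


# Epsilon-protonated His facing a deprotonated Glu via ND1: acceptor meets acceptor.
print(is_donor_acceptor_pair("HIS", "ND1", "GLU", "OE1"))
# The delta-protonated tautomer can donate from ND1 instead.
print(is_donor_acceptor_pair("HIS_D", "ND1", "GLU", "OE1"))
```

Any pair in the catalytic network that comes back as acceptor-acceptor or donor-donor is a hint that a tautomer, a flip, or a protonation state needs another look.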
Thank you very much for the explanation.
I investigated my pdb file to make sure I consider hydrogen bonding correctly.
I added hydrogens with PyMOL and noticed that, in my catalytic network, NE2 of His is protonated and ND1 of His is deprotonated. ND1 of His is within 4.5 Å of OE1 of GLU. As you can see in the attached picture, there is no hydrogen bond between these two atoms (neither OE1 of GLU nor ND1 of His carries a hydrogen). I want to maintain this distance during docking since I know it helps to stabilize the substrate. I was wondering whether adding this distance constraint would cause any problem in docking, since there is no hydrogen bond between the two atoms.
I also checked the protonation state of my PDB file with the pKa program of ROSIE, and the output PDB file is the same as the one from PyMOL.
Moreover, I obtained the pKa values of HIS and GLU from the pKa program of ROSIE. They are 4.9 and 3.5, respectively. Therefore, both are deprotonated at physiological pH.
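The step from those pKa values to "deprotonated" is the Henderson-Hasselbalch relation, which can be checked with a minimal Python sketch. The function name is hypothetical and pH 7.4 is an assumed value for "physiological pH"; for histidine, "protonated" here means the doubly protonated, positively charged ring.

```python
def fraction_protonated(pka, ph=7.4):
    """Henderson-Hasselbalch: fraction of the group in its protonated form."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


# pKa values quoted above for the catalytic HIS and GLU.
for name, pka in (("HIS", 4.9), ("GLU", 3.5)):
    print(f"{name}: pKa {pka} -> {fraction_protonated(pka):.4%} protonated at pH 7.4")
```

Both fractions come out well below one percent, which is consistent with the conclusion that both residues are deprotonated, provided the predicted pKa values can be trusted.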