Hi, everyone,
I am struggling with generating params files to represent a new linear polymer in Rosetta. I am following the general approach used for the polymers already implemented (such as CAAs, NCAAs or nucleic acids), in which you define a params file containing the lower and upper connect entries. I have also set up the internal coordinate entries for these upper and lower connections, which are needed to generate meaningful connections between the polymer's residues. Additionally, I have created upper and lower terminus variants as separate params files, changing their atoms and charges and setting their lower or upper connections to none accordingly.
My problem arises when I try to read a PDB file containing a ten-residue version of the polymer. I am reading all three params files with the -extra_res_fa option; however, Rosetta complains:
ERROR: Unable to find desired residue 'PET' with variant 'UPPER_TERMINUS_VARIANT'. Attempted to add target variant(s) to ResidueType using both ResidueType base name 'PET' and base ResidueType. Was attempting to add new variant type 'UPPER_TERMINUS_VARIANT'
I tried to find out how Rosetta interprets the upper and lower terminus variants, but I could not find much information about this. I looked at the database and found the params files for protein caps (i.e., ACE and NME residues) and, copying that idea, I set up my upper and lower variants as different cap residues; however, the error remained the same.
Is there any way I can make Rosetta aware of the polymer residue's upper and lower terminus variants? Or does my problem lie somewhere else entirely?
I am attaching all my params files.
(I apologise for putting this in the Non-Canonical Peptides category; I could not find anything more adequate).
The problem is that you are reusing the same residue name three times, even though the files differ, whereas what the system expects is a single residue type that has patch definitions for its variants.
Like terminal residues, patches are bizarre... until you have made one: then they make sense. I wrote about them here, but you do not need to read that. In summary, unlike residue types, patches stack (for the category), so you can redeclare the upper and lower terminus patches specifically for your residue, and the copy-paste text mangling is straightforward.
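To spot the name clash before Rosetta does, a quick plain-Python check along these lines may help. It is only a sketch: it assumes each params file declares its residue type on a line starting with NAME, and the file names and truncated file contents below are hypothetical stand-ins, not your actual files.

```python
from collections import defaultdict


def residue_names(params_texts):
    """Map each NAME declared in the params files to the files declaring it.

    params_texts: dict of {file name: file contents as a string}.
    """
    seen = defaultdict(list)
    for filename, text in params_texts.items():
        for line in text.splitlines():
            fields = line.split()
            # A params file declares its residue type as "NAME <name>".
            if len(fields) >= 2 and fields[0] == "NAME":
                seen[fields[1]].append(filename)
    return dict(seen)


def duplicated_names(params_texts):
    """Return only the residue names declared by more than one file."""
    return {
        name: files
        for name, files in residue_names(params_texts).items()
        if len(files) > 1
    }


# Hypothetical file names and heavily truncated contents, for illustration only.
example = {
    "PET.params": "NAME PET\nIO_STRING PET Z\nTYPE POLYMER\n",
    "PET_lower.params": "NAME PET\nIO_STRING PET Z\nTYPE POLYMER\n",
    "PET_upper.params": "NAME PET\nIO_STRING PET Z\nTYPE POLYMER\n",
}

clashes = duplicated_names(example)
for name, files in clashes.items():
    print(f"{name} is declared in {len(files)} files: {', '.join(files)}")
```

If this reports a clash, the terminus files are candidates for conversion into patches rather than separate residue types.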
Ok, I think that was a big step toward adding the variants. I have created the two patch files needed (i.e., for the upper and lower variants). Then, I loaded them with the -extra_patch_fa option, and they were recognized as such. However, when trying to read a 10-residue version of the polymer, I am getting a segmentation fault error:
I then built Rosetta in debug mode to get more information about the line where the error originates:
It says something about the internal coordinates, but I am not sure what exactly could be wrong. Maybe I am missing something obvious about the patch files?
I am attaching my patch files here, as well as the typical ROSETTA_CRASH.log file.
Okay. I gave your snippets a spin in PyRosetta (same as Scripts, pasted below just for posterity), but I was confused by the fact that with the autotermini off (same as the command line argument -use_truncated_termini false) and relying on the original, the output is a mess even when the bonds are optimised:
I suspect the connection might be declared back to front or something peculiar. So that is a way bigger issue than bad termini.
That is, your PDB may read fine, albeit with odd termini, but the structure will blow up.
Also, for the sake of formality, I would expect the oxygen linking the two residues not to be part of the terephthalate, but of the ethylene glycol, as that is where it came from.
So, would you say this is an issue with the definition of the internal coordinates? It seems that inter-monomer connections are the only thing distorted.
Yes. Setting -use_truncated_termini to true circumvents your issue, which is a good way to verify the main params file. Even if the bad connection does not affect each residue individually, it will blow up the structure if the backbone is altered. Feel free to try it.
I looked at your file and it does indeed seem like your connects are wired backwards:
There may be other issues... So it might be worth checking in a simpler system whether it behaves as you are hoping.
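One cheap check that needs no Rosetta at all is to compare the connect lines against the internal coordinate lines in the params text. The sketch below assumes the usual params layout as I understand it: LOWER_CONNECT and UPPER_CONNECT each name one atom, and an ICOOR_INTERNAL line reads child, three numbers, then three stub atoms, so the first stub atom of the LOWER or UPPER entry should be the atom named in the matching connect line. The atom names and numbers in the examples are made up, and this catches only one way of wiring things backwards.

```python
def connection_problems(params_text):
    """List mismatches between *_CONNECT atoms and ICOOR_INTERNAL parents."""
    connect = {}       # "LOWER"/"UPPER" -> atom named on the *_CONNECT line
    icoor_parent = {}  # "LOWER"/"UPPER" -> first stub atom on the ICOOR line
    for line in params_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("LOWER_CONNECT", "UPPER_CONNECT") and len(fields) >= 2:
            connect[fields[0].split("_")[0]] = fields[1]
        elif (fields[0] == "ICOOR_INTERNAL" and len(fields) >= 8
              and fields[1] in ("LOWER", "UPPER")):
            # Assumed layout: ICOOR_INTERNAL child phi theta d stub1 stub2 stub3
            icoor_parent[fields[1]] = fields[5]

    problems = []
    for end in ("LOWER", "UPPER"):
        if end not in connect:
            problems.append(f"no {end}_CONNECT line found")
        elif end not in icoor_parent:
            problems.append(f"no ICOOR_INTERNAL entry for {end}")
        elif connect[end] != icoor_parent[end]:
            problems.append(
                f"{end}_CONNECT is {connect[end]} but ICOOR_INTERNAL {end} "
                f"is built from {icoor_parent[end]}"
            )
    if "LOWER" in connect and connect["LOWER"] == connect.get("UPPER"):
        problems.append("LOWER_CONNECT and UPPER_CONNECT name the same atom")
    return problems


# Hypothetical atom names and geometry, for illustration only.
swapped = """\
LOWER_CONNECT C8
UPPER_CONNECT O1
ICOOR_INTERNAL  UPPER  180.0  60.0  1.35  C8  C7  C6
ICOOR_INTERNAL  LOWER  180.0  60.0  1.35  O1  C1  C2
"""
consistent = """\
LOWER_CONNECT O1
UPPER_CONNECT C8
ICOOR_INTERNAL  UPPER  180.0  60.0  1.35  C8  C7  C6
ICOOR_INTERNAL  LOWER  180.0  60.0  1.35  O1  C1  C2
"""

swapped_problems = connection_problems(swapped)
consistent_problems = connection_problems(consistent)
for message in swapped_problems:
    print(message)
```

An empty list only means the two sets of lines agree with each other; it does not show that the chosen atoms or the geometry are chemically sensible.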