You are here

Problems when working with PTMs - cannot make sugar poses

7 posts / 0 new
Last post
Problems when working with PTMs - cannot make sugar poses
#1

Hello everyone,

I have a protein which has a few PTMs in the form of sugars, and I have therefore started experimenting with sugars in PyRosetta. I have found a very nice guide:

https://graylab.jhu.edu/pyrosetta/downloads/documentation/RosettaCarbohydrate_Tutorial-Demo.pdf

which I have used as a starting point together with the documentation e.g.

https://graylab.jhu.edu/PyRosetta.documentation/pyrosetta.rosetta.core.pose.html?highlight=make_pose_from_saccharide_sequence#pyrosetta.rosetta.core.pose.make_pose_from_saccharide_sequence

I start out by initializing PyRosetta with the appropriate options, and import the specific methods I need:

from pyrosetta import *
pyrosetta.init('-include_sugars','-write_pdb_link_records')

from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose
from pyrosetta.rosetta.core.pose import make_pose_from_saccharide_sequence, pose_from_saccharide_sequence

 

However, whenever I try to use the methods I get runtime errors. If I e.g. enter:

mannose = pose_from_saccharide_sequence('->3)-a-D-Manp')

I get:

ERROR: Unrecognized residue ->3)-alpha-D-Manp when building saccharide sequence ->3)-a-D-Manp

ERROR:: Exit from: /Volumes/MacintoshHD3/benchmark/W.fujii/rosetta.Fujii/_commits_/main/source/src/core/pose/annotated_sequence.cc line: 478

BACKTRACE:
(lot of lines ala 1  ‘rosetta.so                          0x000000012135ff78 rosetta.so + 71184248’)

I got the input for ‘pose_from_saccharide_sequence’ from the guide I linked above, so it should be correct. The documentation states that it:

“Return a Pose from an annotated, linear, IUPAC polysaccharide sequence <sequence> with residue type set name”. I am not an expert on sugars, but as far as I can tell is the input used above ok. I have also tried to input simply ‘galp’, which also does not work.

Can anyone see what I am doing wrong?

My setup is

MacOS 10.14, python 3.6.6, Pyrosetta: PyRosetta4.Release.python36.mac r206

Kind regards,

Martin  

Post Situation: 
Thu, 2019-01-10 07:44
MNP1986

Hello!

Those functions will work for your purposes, but you should also checkout the SimpleGlycosylateMover. If you need to introduce a sequon, the CreateGlycanSequonMover can be used to do this. You can also checkout the CreateSequenceMotifMover in protocols.calc_taskop_movers

Use this to import most of the new glycan classes and utils:

  from rosetta.protocols.carbohdyrates import *

  There are also a few glycan residue selectors in rosetta.protocols.select.residue_selectors

Once you glycosylate the pose at the place you want, you can access the new GlycanTreeSet via pose.glycan_tree_set().  Lots of functions that can be useful to you.  

Finally, for modeling, you should probably use the GlycanTreeModelor.  Its also in protocols.carbohydrates.  It is not published yet, but has been used to do de-novo modeling of glycans with decent results.  Jason Labonte and I have spent the past few years developing it (and associated methods/classes) and it should be published sometime this year (with full documentation on all of these).  However, they are all accessible in PyRosetta, all bug tested, and all benchmarked on high-resolution crystal structures, so go ahead and use them if you want. 

 

As for the IUPAC naming, here is the README we use when defining new common glycans such as man5 or man9 (which can simply be passed to the SimpleGlycosylateMover.  It should help you define the correct nomenclature used. You can find it in PyRosetta in database/chemical/carbohydrates/common_glycans I have included it here. 

Cheers,

Jared Adolf-Bryfogle

File attachments: 
Thu, 2019-01-10 08:20
jadolfbr

Dear jared,

Thank you so much for the very detailed reply - very much appreciated!

For some reason, pose_from_saccharide_sequence worked after restarting my IDE, so perhaps I had initialited it wrong the first time.

I have played a bit around with the different methods and movers, and I am almost able to do everything I want to do. I have however run into into a problem: I cannot get sialic acids to work, even though two are listed in 'codes_to_roots.map':

# Ketononoses (e.g., Sialic Acids)
Neu neuramin  *  2  # D inherent
Kdn keto-deoxy-nonulon  *  2  # D inherent

Sialic acids have nine carbons but only have five carbons in the ring ( https://en.wikipedia.org/wiki/Sialic_acid ). I would therefore expect ''a-Neuf5Ac'' to work but is does not. The Error I get is:

ERROR: Unrecognized residue ->4)-alpha-Neuf:5-Ac when building saccharide sequence a-Neuf5Ac-(2->3)

Can anyone help?

Kind regards,

Martin

EDIT:

it should be 'p' when there are five carbons in the ring, but if I try ''a-Neup5Ac'' I also get an error

EDIT 2:

I got it working using '->8)-a-Neup5Ac-(2->3)' so it was the default value of ->4)... which caused it not to work. Like I said, I am still a rookie with glycans, so not always super on top of the correct settings for the different glycans :)

 

Wed, 2019-01-16 04:47
MNP1986

Hello again,

I have tried to include the option '-auto_detect_glycan_connections' described here: https://www.rosettacommons.org/docs/latest/application_documentation/carbohydrates/WorkingWithGlycans

It, however, gives me the import error:

from pyrosetta import *
pyrosetta.init('-include_sugars','-write_pdb_link_records','-auto_detect_glycan_connections')
Traceback (most recent call last):

  File "<ipython-input-21-fcfdbbb80342>", line 8, in <module>
    pyrosetta.init('-include_sugars','-write_pdb_link_records','-auto_detect_glycan_connections')

  File "/Users/martinpedersen/anaconda3/envs/py36/lib/python3.6/site-packages/pyrosetta-2019.1+release.dbc838b6ae6-py3.6-macosx-10.7-x86_64.egg/pyrosetta/__init__.py", line 160, in init
    assert set_logging_handler in (None, True, False, "interactive", "logging")

AssertionError

 

Thought it would be nice for you to know.

Kind regards,

Martin

 

 

Thu, 2019-01-17 01:55
MNP1986

If you are within a notebook, make sure to restart the kernal any time you re-init. 

Thu, 2019-01-17 08:46
jadolfbr

I usually run my scripts on our cluster with the command: python <script.py> > pyros.log

I am therefore not sure it has to do with the kernel :)

 

Kind regards,

Martin

Mon, 2019-01-21 05:56
MNP1986

Hello again,

I have been able to modifiy our MX structure, albeit with some errors, failures and warnings from GlycanTreeModeler. I would also like to use standard movers like SmallMover and ShearMover to model two hinge regions, which has also proved difficult. I hope it is OK I keep asking the questions in this thread, otherwise I am  happy to make a new one.

Regading GlycanTreeModeler:

After adding the glycosylations using SimpleGlycosylateMover, I call GlycanTreeModeler which succesfully minimise the structure, although I have to call it many times (> 20) to get reasonable minimisation and even after 100 iterations are there problematic tryptophan geomtries/bond lengths (Trp can have mannose added).

The warnings I tipycally get are of the sort:

protocols.carbohydrates.GlycanTreeMinMover: Minimizing from carbohydrate root: 53
protocols.carbohydrates.GlycanTreeMinMover: Minimizing 3 residues
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 13!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 1does not have enough heavy atoms about its connection to define a torsion angle!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 14!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 2does not have enough heavy atoms about its connection to define a torsion angle!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 13!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 1does not have enough heavy atoms about its connection to define a torsion angle!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 14!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 2does not have enough heavy atoms about its connection to define a torsion angle!
protocols.carbohydrates.LinkageConformerMover: Sampling alpha-L-Fucp-(?->6)-beta-D-GlcpNAc- linkage
protocols.carbohydrates.LinkageConformerMover: Sampling conformer 2 which has a population of 0.1813
protocols.carbohydrates.LinkageConformerMover: Complete
...
core.pack.task: Packer task: initialize from command line()
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 4 ASN:N-glycosylated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 23 SER:O-conjugated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 26 SER:O-conjugated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 42 TRP:C-glycosylated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 45 TRP:C-glycosylated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 48 TRP:C-glycosylated

and the failures/errors I get:

protocols.carbohydrates.LinkageConformerMover: Sampling beta-D-Glcp-(?->3)-alpha-L-Fucp- linkage
protocols.carbohydrates.LinkageConformerMover: no conformer found.  Using sugar BB sampler if possible.
protocols.carbohydrates.GlycanTreeSampler: Last mover failed.  Continueing!
protocols.carbohydrates.LinkageConformerMover: Sampling alpha-D-Manp-(?TRP linkage
protocols.carbohydrates.LinkageConformerMover: no conformer found.  Using sugar BB sampler if possible.
protocols.carbohydrates.LinkageConformerMover:  Cannot run sugar BB Sampler on linkage as linkage as a protein-glycan linkage. Skipping
protocols.simple_moves.bb_sampler.SugarBBSampler: 81 81 A  torsion: 2
protocols.simple_moves.bb_sampler.SugarBBSampler: No data for linkage.  Either this is psi and previous residue is not carbohydrate or we do not have a pyranose ring in the previous residue.
protocols.simple_moves.BBDihedralSamplerMover: [ ERROR ] Could not set torsion for resnum 81
protocols.carbohydrates.GlycanTreeSampler: Last mover failed.  Continueing!

To investigate further, I made a simply dummy pose (a pose_from_sequence) which was a straight line and tried to add the glycosilation to that - perhaps some of the problems stem from the glycosylations being added in regions without so much space around it. Yet, I still ran into problems. I have attached a screenshot of wierd Trp geomtry/bond lengths, with the Trp sitting next to Ala so there should be plenty of space.

being unfamiliar with GlycanTreeModeler, I thought that maybe by using standard relaxtion movers like MinMover I could at least fix the tryptophan geometries (with a movemap with set_chi(True)), which in turn would allow  GlycanTreeModeler to fix the errors with the glycans (if any is present, SimpleGlycosylateMover does not produce any warnings). That is, perhaps if I iteratively called GlycanTreeModeler and MinMover then problems would might go away. This proved not to be the case, at least not with 20 iterations - perhaps I could improve them by iterating longer.

Regarding other movers (e.g. SmallMover and ShearMover)

The main problem I have had is that I get zero accepted moves regardless of how high I put kT. This prompted me to log in the log and I can see that I get the error of the type:

core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ;  atomno= 6 rsd= 10 ;  atomno= 8 rsd= 10 ;  atomno= 1 rsd= 56
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ;  atomno= 6 rsd= 10 ;  atomno= 8 rsd= 10 ;  atomno= 1 rsd= 56
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 42 ;  atomno= 6 rsd= 42 ;  atomno= 7 rsd= 42 ;  atomno= 1 rsd= 59
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 45 ;  atomno= 6 rsd= 45 ;  atomno= 7 rsd= 45 ;  atomno= 1 rsd= 60
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 48 ;  atomno= 6 rsd= 48 ;  atomno= 7 rsd= 48 ;  atomno= 1 rsd= 61

 

Interestingly, it is core.conformation.carbohydrates.util which outputs the error, but the errors are printed after GlycanTreeModeler has printed the summary of its run e.g.

protocols.carbohydrates.GlycanTreeModeler: Start- : 13198.6
protocols.carbohydrates.GlycanTreeModeler: Pre  - : 790.491
protocols.carbohydrates.GlycanTreeModeler: Post - : 829.677
protocols.carbohydrates.GlycanTreeModeler: Final- : 790.491
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 4 ;  atomno= 6 rsd= 4 ;  atomno= 8 rsd= 4 ;  atomno= 1 rsd= 53
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ;  atomno= 6 rsd= 10 ;  atomno= 8 rsd= 10 ;  atomno= 1 rsd= 56
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 42 ;  atomno= 6 rsd= 42 ;  atomno= 7 rsd= 42 ;  atomno= 1 rsd= 59
...

One thing I learned is that if I output a pose (pose.dump_pdb) and reload it again, it get the following warnings which are likely connected to the AtomID errors:

core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc53 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Gal54 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Neu55 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc56 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
...
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 53.  Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 56.  Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 59.  Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 60.  Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 61.  Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-L-Fucp 62.  Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-L-Fucp 64.  Returning BOGUS ID instead.

 

In conclusion:

I get both warning, error and failure flags whenever I try to add glycosylations. If I furthermore try to use other movers to perturb the backbone in hinge regions, no moves are accepted. Here is a dummy version of the script where I add the glycosylation to a AA sequence I just cooked up, in case it has something to do with the way I do it:

from pyrosetta import *
pyrosetta.init('-include_sugars','-write_pdb_link_records')#,'-auto_detect_glycan_connections')

#%%
from pyrosetta.rosetta.protocols import relax
from pyrosetta.rosetta.protocols.simple_moves import SmallMover, ShearMover
from pyrosetta.rosetta.protocols.minimization_packing import MinMover, PackRotamersMover

from pyrosetta.rosetta.core.pack.task import TaskFactory

from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose
from pyrosetta.rosetta.core.pose import make_pose_from_saccharide_sequence, \
                                        pose_from_saccharide_sequence

from pyrosetta.rosetta.protocols.carbohydrates import *
from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose, glycosylate_pose_by_file

from pyrosetta.rosetta.utility import vector1_std_string as vstr
sugar_bb = pyrosetta.rosetta.core.scoring.ScoreType.sugar_bb

#%%

import os, glob
import numpy as np
from timeit import default_timer as timer
import datetime

#%%

path = os.path.abspath('.') + '/'
os.chdir(path)


sequence = 'NWS'*4+'PPPP'*2+'NWS'*4+'PPPP'*2+'AWA'*4

pose = pose_from_sequence(sequence)
#pose.dump_pdb('NoPTM.pdb')

#%%

string1 = '->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->6)-'+\
         '[->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->3)]-' +\
         '[b-D-GlcpNAc-(1->4)]-b-D-Manp-(1->4)-b-D-GlcpNAc-(1->4)-'+\
         '[a-L-Fucp-(1->6)]-b-D-GlcpNAc'

string2 = 'b-D-Glcp-(1->3)-a-L-Fucp'

string3 = '->4)-a-D-Manp'

#%%

string1 = '->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->6)-'+\
         '[->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->3)]-' +\
         '[b-D-GlcpNAc-(1->4)]-b-D-Manp-(1->4)-b-D-GlcpNAc-(1->4)-'+\
         '[a-L-Fucp-(1->6)]-b-D-GlcpNAc'

string2 = 'b-D-Glcp-(1->3)-a-L-Fucp'

string3 = '->4)-a-D-Manp'

#%%
mov = SimpleGlycosylateMover()
mov.set_glycosylation(string1)
pos = [4,10]
for ps in pos:
    mov.set_position(ps)
    mov.apply(pose)

mov.set_glycosylation(string3)
pos = [42,45,48]
for ps in pos:
    mov.set_position(ps)
    mov.apply(pose)

mov.set_glycosylation(string2) # cannot finish with mannose addition, throws error
pos = [23,26]
for ps in pos:
    mov.set_position(ps)
    mov.apply(pose)


#%%
scorefxn = get_fa_scorefxn()
kT = 2
n_moves = 1
mc = MonteCarlo(pose, scorefxn, kT)


# Glycosylation mins
mins_glyc = GlycanTreeModeler()


# prepare for hinge region
hinge_reg = [1,52]
hinge = MoveMap()
hinge.set_bb_true_range(hinge_reg[0], hinge_reg[1])

# Movers for hinge region
smallmover = SmallMover(hinge, kT, n_moves)
smallmover.angle_max(10)
shearmover = ShearMover(hinge, kT, n_moves)

# optimese rotamers
to_pack = TaskFactory.create_packer_task(pose)
to_pack.restrict_to_repacking()    # prevents design, packing only
to_pack.or_include_current(True)
packmover = PackRotamersMover(scorefxn, to_pack)

# prepare the sequence mover
seq_mover = SequenceMover()

seq_mover.add_mover(mins_glyc)
num_reps = 10 #  
for num in range(num_reps):
    seq_mover.add_mover(smallmover)
    seq_mover.add_mover(shearmover)

seq_mover.add_mover(packmover)
trial_mover = TrialMover(seq_mover, mc)

file = 'Dummy.log'
header = '# kT = {:0.0f}, n moves/mover = {:d}\n'.format(kT, n_moves) +\
        '# Timestamp, elapsed [s], Cycle, Score [REU], Acc. rate [Acc/trials]\n'


with open(file,'w+') as f:
    f.writelines(header)
    time = timer()

    for num in range(200):
        trial_mover.apply(pose)

        if np.mod(num+1,1) == 0: # conditional used if I want to output less
            E_temp =  scorefxn(pose)
            elapsed = timer()-time
            date = datetime.datetime.now().strftime("%H:%M:%S")
            acc_rate = trial_mover.acceptance_rate()

            string = date + ', {:0.0f}, {:d}'.format(elapsed, num+1) +\
            ', {:0.2f}, {:0.2e}\n'.format(E_temp, acc_rate)
            f.writelines(string)

            if np.mod(num+1,1) == 0: # conditional used if I want to output less
                filename = 'dum_{:03d}.pdb'.format(num+1)
                pose.dump_pdb(filename)

void = mc.recover_low(pose)
filename = 'xDum_lowest.pdb'
pose.dump_pdb(filename)

Note that in this script, the movers are able to change BB angles

File attachments: 
Mon, 2019-01-21 05:54
MNP1986