Floppy Tail and Constraints



I am trying to model the C-terminal tail of a membrane protein, which is not resolved in the crystal structure, and run MD afterwards. It's 40 residues long and disordered. I have chosen the FloppyTail protocol. Since the protein sits in a membrane, the Z coordinate of every tail residue should be greater than a specific value, so that the tail stays out of the membrane and lies completely in the cytoplasmic region.

What I am doing so far is sampling without any constraint, starting from the protein attached to a tail that is extended into space, and then choosing the predicted models that have proper Z values for the tail, which clearly wastes computational time. I also thought of including the membrane in my PDB input and then modeling the tail; maybe I should try that.

It looks more reasonable, though, to use some constraints. I'm just not sure which constraint type I should choose to constrain the Z of all tail residues to be more than, let's say, 12. I would appreciate any suggestion on which type of constraint to choose in this case.

I will attach the options I have used for FloppyTail as well. I'd also appreciate any comments on them.




-in:file:s ./input_files/pro_tail.pdb
-in:file:frag3 ./input_files/robetta/aat000_03_05.200_v1_3

-run:min_type dfpmin_armijo_nonmonotone
-FloppyTail:flexible_start_resnum 187
-FloppyTail:flexible_chain X
-FloppyTail:short_tail:short_tail_off 0
-FloppyTail:short_tail:short_tail_fraction 1.0

-FloppyTail:shear_on .3333

-FloppyTail:perturb_temp 0.8
-FloppyTail:refine_temp 0.8

-FloppyTail:refine_repack_cycles 100
-FloppyTail:perturb_cycles 5000
-FloppyTail:refine_cycles 3000
-nstruct 700

Fri, 2017-05-12 12:23

There is a similar question to this somewhere on the message boards - basically wanting to model a membrane as "don't go here" space - but I don't know that we resolved it.  We don't have a way to do partial coordinate constraints.

A) There is a Rosetta membrane mode, but I don't really know anything about it or how it works.  It hasn't been combined with FloppyTail to my knowledge.  The membrane scorefunction has a notion of depth, but I don't know if there's a way to use it as constraints in this sense.

B) I would try building some virtual residues (or even something like magnesium atoms; it won't matter, so long as they stay still) in a plane along your Z-axis boundary, and then build repulsive constraints between your tail residues and those Z-axis definers.  There's no simple way to do this already in Rosetta (that I know of).  If you have a C-terminal tail, put all these added atoms first in the PDB, and they'll stay constant during modeling.  Your constraints would look like:

AtomPair CA tail_i ?? virt_j BOUNDED X 1000 0.5 0.5 tag

where tail_i are the i tail residues, virt_j are the j residues in your membrane barrier thing, ?? is whatever atom they have (virtuals if you can figure it out, but magnesiums or whatever will work), and X is the distance at which you want repulsion to start.  This functional form means 0 penalty between X and 1000 angstroms distance, and a penalty for closer/further.

Making the grid of faux atoms is the hard part.  I don't know what spacing they need - probably 10 angstroms would work?  Probably roughly the distance of repulsion you want anyway?

C) If your protein is actually oriented correctly, and you are ok with programming, you could hack together a scorefunction term that penalizes any atom with a Z coordinate beyond some threshold pretty quickly.

Computer time is usually cheap compared to developer time, so I wouldn't sweat wasting it on models and just filtering after.  (FloppyTail was written under that assumption.)
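Option B could be sketched roughly as below. This is only a sketch under stated assumptions, not a tested Rosetta workflow: the function names, the chain ID "Z", the tag "membrane", the 60 x 60 angstrom patch, and the use of MG as the pseudo-atom are all made up for illustration, and the residue numbers in the constraint lines assume the grid atoms come first in the PDB so the tail numbers shift by the number of grid atoms.

```python
from itertools import product


def membrane_grid(z_plane, half_width, spacing):
    """Return (x, y, z) points on a square grid in the plane z = z_plane."""
    steps = int(half_width // spacing)
    offsets = [i * spacing for i in range(-steps, steps + 1)]
    return [(x, y, z_plane) for x, y in product(offsets, offsets)]


def hetatm_line(serial, resseq, x, y, z, chain="Z"):
    """Format one magnesium pseudo-atom as a fixed-column PDB HETATM record."""
    name, resname, element = "MG", "MG", "MG"
    return (
        f"HETATM{serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}          {element:>2s}"
    )


def constraint_lines(tail_residues, grid_residues, min_dist, tag="membrane"):
    """One AtomPair BOUNDED line per (tail residue, grid atom) pair.

    Zero penalty between min_dist and 1000 angstroms, as in the post above.
    """
    return [
        f"AtomPair CA {t} MG {g} BOUNDED {min_dist:.1f} 1000 0.5 0.5 {tag}"
        for t in tail_residues
        for g in grid_residues
    ]


# Assumed example values: plane at Z = 12, 10 angstrom spacing and repulsion.
grid = membrane_grid(z_plane=12.0, half_width=30.0, spacing=10.0)
grid_pdb = [hetatm_line(i, i, *xyz) for i, xyz in enumerate(grid, start=1)]

# Grid atoms go first in the PDB, so the tail (187-230) shifts by len(grid).
n_grid = len(grid)
tail = range(187 + n_grid, 230 + n_grid + 1)
constraints = constraint_lines(tail, range(1, n_grid + 1), min_dist=10.0)
```

Note that with a distance-based repulsion the excluded surface sits roughly X above the grid atoms and dips between them, so the plane probably needs to be placed below the intended Z boundary and the spacing tuned by inspection.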

Fri, 2017-05-12 12:50

Thanks to your help, I have now modeled the flexible tail of the protein by producing 10,000 models, choosing those that don't enter the membrane area and analyzing based on clusters and scores (and partially the general shape). 
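For anyone repeating this, the filtering step can be sketched in a few lines. This is a minimal illustration, not the script actually used here: the function names are made up, the defaults (tail starting at residue 187, cutoff Z = 12) are taken from earlier in the thread, and it assumes the models are standard fixed-column PDB files already oriented with the membrane normal along Z.

```python
def tail_min_z(pdb_text, first_tail_res, chain=None):
    """Return the smallest Z coordinate among tail atoms.

    Reads fixed-column ATOM records; residue numbers are PDB numbers.
    """
    z_values = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if chain is not None and line[21:22] != chain:
            continue
        if int(line[22:26]) >= first_tail_res:
            z_values.append(float(line[46:54]))
    if not z_values:
        raise ValueError("no tail atoms found; check residue numbering")
    return min(z_values)


def clears_membrane(pdb_text, first_tail_res=187, z_cutoff=12.0, chain=None):
    """True if every tail atom lies above z_cutoff."""
    return tail_min_z(pdb_text, first_tail_res, chain) > z_cutoff
```

Calling clears_membrane on the text of each output PDB and keeping the models that return True gives the subset to pass on to clustering and score analysis.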

Moving forward, I am trying to model the tails with, for example, 4 of these proteins next to each other, as I know the complex structure for the monomers. This means I would like to model 4 different parts of my structure using FloppyTail. I also know that two of the 4 monomers are identical, and the other two are also identical; imagine two pairs of identical dimers sitting together in a line or something. 

I was wondering if there is some sort of symmetry I can use with FloppyTail to make sure that the identical monomers are modeled the same way. As far as I could learn (and I would like to confirm it here), symmetrical modeling is not implemented in FloppyTail. I could only find a modification in which might help the problem.

Tue, 2017-08-22 12:46

Jeliazko has made FloppyTail symmetry compatible; it runs in the same way the rest of symmetry-enabled Rosetta runs (  If you try that and it doesn't work I'll loop him in.

Tue, 2017-08-22 15:02

To have a complex with 4 monomers, I am trying to add a fold tree to my flags to run FloppyTail. It didn't work on the first few tries, so I am trying it for only 1 monomer/chain, just to test. Very simple:

I have a file named fold_tree.ft, which includes the line: FOLD_TREE EDGE 1 230 -1  

and the file movemap, RESIDUE 187 230 BBCHI

I have included the following options in my flags:

-in:file:fold_tree ./input_files/fold_tree.ft
-in:file:movemap ./input_files/move_map

The predicted model is irrelevant. Residues 1 to 187 haven't moved, residues 187 to 223 are displaced (and modeled), and residues 223 to 230 also haven't moved or been modeled.

I am looking in my log file, I see these two lines:

non-terminal or N-terminal (C-rooted) floppy section, using a linear fold tree to try to ensure downstream residues follow
Old tree: FOLD_TREE  EDGE 1 192 -1  EDGE 1 193 1  EDGE 193 229 -1 new tree: FOLD_TREE  EDGE 1 192 -1  EDGE 192 193 1  EDGE 193 229 -1

WARNING: The following options have been set, but have not yet been used:
        -in:file:fold_tree ./input_files/fold_tree.ft

Obviously Rosetta is not reading my fold tree file, meaning (please correct me if I am wrong) it either contradicts or is overwritten by other options, or the input command is wrong.

But the predicted model still doesn't match the fold tree reported in the log file either. I have checked the residue numbering in my PDB: it starts from 1 and goes all the way to 230, and I am trying to model residues 187 to 230. The rest of the options are the same as in the very first message I left here.



Tue, 2017-08-29 09:51

A) FloppyTail predates there being an in:file:fold_tree option (or at least my knowledge of it) - so it doesn't support it.  I've done a lot of manual hacking over the years where it would surely have been easier to just read custom fold trees from a file...

B) You are telling me you have one monomer from residues 1-230.  Rosetta is saying you have two chains: 1 to 192 and 193 to 229, and no residue 230 at all.  Which is it?  What's going on at 192?  What happened to 230?  Ligands?  Heteroatoms?

I have no explanation for residues 222 and 223 behaving differently in this regime.  Do you mean 223+ haven't moved internally, or in 3-d space?  Are you at short or long cycle counts?  (In other words is it plausible that monte carlo has just not yet sampled here?)
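Reading a FOLD_TREE line from the log by eye is error-prone, so here is a small stand-alone sketch that splits one into peptide segments and jumps. It is not part of Rosetta; the function names are made up, and it only handles the plain "EDGE start stop label" form seen in the log above.

```python
def parse_fold_tree(text):
    """Parse 'FOLD_TREE EDGE a b label ...' into a list of (a, b, label)."""
    tokens = text.split()
    if not tokens or tokens[0] != "FOLD_TREE":
        raise ValueError("expected a string starting with FOLD_TREE")
    edges = []
    i = 1
    while i < len(tokens):
        if tokens[i] != "EDGE" or i + 3 >= len(tokens):
            raise ValueError(f"unexpected token at position {i}: {tokens[i]}")
        start, stop, label = (int(t) for t in tokens[i + 1:i + 4])
        edges.append((start, stop, label))
        i += 4
    return edges


def summarize(edges):
    """Return (peptide segments, jumps, highest residue number)."""
    peptides = [(a, b) for a, b, label in edges if label == -1]
    jumps = [(a, b, label) for a, b, label in edges if label > 0]
    last_residue = max(max(a, b) for a, b, _ in edges)
    return peptides, jumps, last_residue


new_tree = "FOLD_TREE  EDGE 1 192 -1  EDGE 192 193 1  EDGE 193 229 -1"
peptides, jumps, last_residue = summarize(parse_fold_tree(new_tree))
print("peptide segments:", peptides)
print("jumps:", jumps)
print("pose length:", last_residue)
```

For the tree in the log this reports two peptide segments (1-192 and 193-229), one jump between 192 and 193, and a pose of 229 residues, which is the mismatch with a single 230-residue chain described in B).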

Tue, 2017-08-29 11:03

A) So to clarify: since FloppyTail does not support a custom fold tree, it cannot be used to model the tails of 2 different chains/monomers at the same time?

B) I think I know where the problem is now, I have to look into the bonds and atoms more closely. I will follow up if the problem persists.

Tue, 2017-08-29 11:52

FloppyTail does not support custom fold trees via the -fold_tree command line option.  FloppyTail supports custom fold trees via some combination of carefully setting up input PDBs just-so, and occasionally me writing hardcoded patches with hardcoded fold trees as needed.  I've filed an issue for adding that flag option to FT for a more permanent fix, but I'm more able to offer you a hack if necessary than add the feature in a general sense.  (writing you a hack is provided gratis but also untested, adding that flag to the main codebase requires testing, etc...)  If we get all the other issues ironed out and it becomes clear that we cannot get the fold tree you need from the code as it stands, I will write you a patch.  (You are not the first person for whom I'll have done this).

If you have a bunch of C-terminal tails in a multimer there should be no need for a custom-written fold tree; we might need to comment one line of code out.  That's exactly the case that Jeliazko did recently so it should work out of the box.

Tue, 2017-08-29 12:55

Thanks for your reply. I see your point. so here is the system I am trying to model:

Four monomers next to each other in one strand. All four are missing the C-terminal tail, which is known to be completely floppy. The two monomers in the middle have a small domain which is bound to the last few residues of the tail. The binding conformation between those residues and that domain is known, and I would like to keep that conformation the same; therefore I intend to run FloppyTail on every residue in the tail except those last ones (the last 7, for example).

They are four completely  separated monomers, they are only sitting close to one another, which limits the available space for the tails.

I have previously modeled the tail alone, and the tail with the attached domain for one monomer.

Tue, 2017-08-29 13:32

"The binding conformation between those residues and that domain is known, "

If you know the take-off and landing points of your floppy region - you have a loop modeling problem and not a floppytail problem.  It ceases to be a tail when you have fixed residues beyond it.  FloppyTail DOES model internal linkers - but that's for giving rigid-body motion between domains connected by the linker (think about it as docking via a tether).  If you want the endpoints of your flexible region to remain fixed in 3-D space, it's a loop modeling problem; if you want the endpoints to MOVE in 3-D space but remain fixed in internal coordinates; use FloppyTail.  (If you want the endpoints to move in both spaces use relax).

Objectively, loop modeling has terrible recovery for very long loops - but subjectively, it will do fine for the sort of "sample me an envelope of possibilities" you are trying to do. 

Tue, 2017-08-29 13:57

Sorry if my earlier explanation was not clear enough; yes, I want the end of the tail to be able to move in 3-D space (along with the attached domain), but not move internally (to maintain the binding conformation with the attached domain). I imagine it can be interpreted as a linker - the end of the tail and the attached domain can have rigid body motion.

Once more, this is the case for the two middle monomers; the first and the last monomers don't have a domain attached to them, so the whole C-terminal tail should be modeled as a floppy tail (move in 3-D space and internally).

So I believe the best solution is still FloppyTail.

Tue, 2017-08-29 16:04

So there are 6 chains total, 4 of type A and 2 of type B.  2 of the A-type have totally floppy C-terminal tails.  2 of A-type have mostly floppy C-terminal tails, but the last few residues are bound to one of the B-type chains.  You want the B-type chains to move rigidly with the tail tip w/r/t the other chains in the pose. 

Yes, under that construction, FloppyTail will do it, and we'll need a custom fold tree.  I can sketch you out what the code looks like and you can fill it in yourself with the actual residue IDs, or you can send me a PDB of your system and I can return a patched

Wed, 2017-08-30 13:54

That's the exact description of my system . I appreciate it, and I think I'd like to give it a try myself.

Thu, 2017-08-31 08:21

I like that answer!

First: We need to define all the interesting points of the system in Rosetta residue numbering (not PDB numbering).  I'll use letters here.  For your domain types, I assume the two free monomers are first (A type), then the two monomers that have a partner (A' type), and then the two partners (B type) are in that order in the PDB (A A A' A' B B) - you can rearrange the PDB to match that, or rearrange my suggestions to match that order.

A: first residue (1)

B: last residue of first A monomer

C: first residue of second A monomer

D: last residue of second A monomer

E: first residue of first A' monomer

F: last residue of first A' monomer

G, H: same for second A'

I, J: same for first B

K, L: same for second B


Now your FoldTree needs to have PEPTIDE edges (that's all caps in the code too) between the start and end of each chain (so A to B, C to D, etc).

You need Jump edges between A and C, A and E, and A and G.  The N-terminus of each of the A and A' chains will be linked and non-mobile.

You need Jump edges between F and I, and H and K.  This ties the C-terminus of an A' to the N-terminus of its B partner.  That makes five jumps in total for six chains.

Partial code for this - I've left some of it for you to fill in (more add_edge lines), and you also need to fill in all the A B C D etc:

if (true) { //begin sfulad2 hack
  core::kinematics::FoldTreeOP sfulad2Tree( new core::kinematics::FoldTree );
  //add_edge(start, stop, label) //label = -1 for peptide, # for jump
  using core::kinematics::Edge; //Edge::PEPTIDE
  sfulad2Tree->add_edge(A, B, Edge::PEPTIDE); //that's numbers A and B, not domain types A and B
  sfulad2Tree->add_edge(C, D, Edge::PEPTIDE);
  ///same for other PEPTIDE edges - you need to add FOUR more

  //now do the jump edges, note the numbers increase
  sfulad2Tree->add_edge(A, C, 1);
  sfulad2Tree->add_edge(A, E, 2);
  //same for the remaining A to A' N to N jump, number 3 - you need to add ONE more

  //now do the A' Cterm to B Nterm jumps
  sfulad2Tree->add_edge(F, I, 4);
  sfulad2Tree->add_edge(H, K, 5);


  foldtree_ = sfulad2Tree;
  TR << "sfulad2 FoldTree: " << *foldtree_ << std::endl;
} //end sfulad2 hack


This goes in at around line 354 of - after the COM-rooted fold tree stuff, and before the TaskFactory stuff.
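Before recompiling, it can help to sanity-check the residue numbers you plug in. The sketch below is a simplified stand-alone check, not Rosetta's own fold tree validation: it only verifies that every residue is reached exactly once from a single root, that there is no cycle, and that jump labels run 1..n. The chain lengths (four 230-residue A-type chains, two 50-residue B-type chains) are invented for the example, and the jump layout shown (A to C, A to E, A to G, F to I, H to K) is one assumed arrangement with five jumps for six chains.

```python
def check_fold_tree(edges, n_res):
    """Return a list of problems; an empty list means the edges form a tree.

    edges: (start, stop, label) tuples, label -1 for peptide, >0 for jumps.
    """
    problems = []
    parent = {}
    for start, stop, label in edges:
        if label == -1:
            # A peptide edge reaches every residue after start, one by one.
            step = 1 if stop > start else -1
            pairs = [(r - step, r) for r in range(start + step, stop + step, step)]
        else:
            pairs = [(start, stop)]
        for up, down in pairs:
            if down in parent:
                problems.append(f"residue {down} is reached twice")
            parent[down] = up
    roots = [r for r in range(1, n_res + 1) if r not in parent]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {roots}")
    for r in range(1, n_res + 1):
        seen = set()
        while r in parent and r not in seen:
            seen.add(r)
            r = parent[r]
        if r in seen:
            problems.append(f"cycle through residue {r}")
            break
    labels = sorted(label for _, _, label in edges if label > 0)
    if labels != list(range(1, len(labels) + 1)):
        problems.append(f"jump labels {labels} are not 1..{len(labels)}")
    return problems


# Hypothetical numbering: chains A, A, A', A' of 230 residues; B, B of 50.
A, B, C, D, E, F = 1, 230, 231, 460, 461, 690
G, H, I, J, K, L = 691, 920, 921, 970, 971, 1020
edges = [(A, B, -1), (C, D, -1), (E, F, -1), (G, H, -1), (I, J, -1), (K, L, -1),
         (A, C, 1), (A, E, 2), (A, G, 3), (F, I, 4), (H, K, 5)]
print(check_fold_tree(edges, n_res=L))
```

A jump that lands on a residue already covered by a peptide edge (for example from A to the last residue of a chain whose first residue is also a jump target) shows up as "reached twice".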


I'd forgotten Jeliazko's COM (center of mass) stuff existed until looking at this, but it suggested to me two other shallower hacks we can also try if you don't want to write this much code.

1) using his COM_root option (-COM_root), set up your PDB so that you have 4 chains, not 6 - just embed the B chains as part of the A' chains in the numbering.  Rosetta will give an awful score to the geometry around the "fake" A->B bond, but since that region is constant anyway the scores will wash out in comparisons between models.  That is likely to work out of the box with a little PDB hacking.

2) Same deal with hacking up the PDB, but instead of using COM_root, just  comment out lines 276-289 of, which auto-detect internal linkers and linearize the FoldTree to compensate.  The default fold tree is likely to work if you hack the PDB in the way I just described.


Thu, 2017-08-31 09:33

I made the changes as you said, but there are two problems;

- based on the log file, I guess Rosetta is not reading the hack.

- and the output, whatever it looks like, starts from the second residue, meaning it discards the first residue of all A/A' monomers.

I couldn't attach the PDB here for you to take a look at; I would appreciate your help again.

Thu, 2017-09-07 12:48

`  TR << "sfulad2 FoldTree: " << *foldtree_ << std::endl;` will make it crystal clear if the code ran or not.  Did you recompile?


Dumping the first residue of the A/A' chains - Rosetta will tell you why if you read the log file, but probably it's missing a backbone heavy atom.  Are N, C, CA, O defined for those residues, with nonzero occupancy?
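A quick way to check this outside Rosetta is to scan the PDB for residues whose backbone heavy atoms are absent or have zero occupancy. This is a minimal sketch with made-up function names; it assumes standard fixed-column ATOM records and treats a blank occupancy field as 1.0, which is an assumption rather than documented Rosetta behaviour.

```python
BACKBONE = ("N", "CA", "C", "O")


def incomplete_backbone(pdb_text):
    """Map (chain, resseq, icode, resname) -> backbone atoms that are unusable.

    An atom counts as unusable if it is absent or has occupancy <= 0.
    """
    found = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        key = (line[21:22], int(line[22:26]), line[26:27].strip(), line[17:20])
        usable = found.setdefault(key, set())
        name = line[12:16].strip()
        occ_field = line[54:60].strip()
        occupancy = float(occ_field) if occ_field else 1.0
        if name in BACKBONE and occupancy > 0.0:
            usable.add(name)
    return {
        key: [atom for atom in BACKBONE if atom not in usable]
        for key, usable in found.items()
        if any(atom not in usable for atom in BACKBONE)
    }
```

Running incomplete_backbone on the text of the input PDB and looking at the entries for the first residue of each A/A' chain should show whether a missing or zero-occupancy atom explains the dropped residues.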


Attaching files - you can usually get it to attach arbitrary files if you just change the extension to .txt. so mypdb.pdb.txt or whatever.

Fri, 2017-09-08 09:16