loop_length_changing

Dear friends,
I am trying to use "loop_length_changing". The demo is straightforward:

rosettademos\public\AnchoredDesign\loop_length_changing

Can I ask
1) How to specify the loop file?

The demo loop file uses "LOOP 18 28 21 0 0" to model the inserted ALA 21-25. As I understand it, the three residues before (18-20) and after (26-28) are also included in the loop region, and the cutpoint 21 refers to the actual start of the loop. Is my understanding correct? I am still quite confused about the meaning of "cutpoint" (the third number).

In my case, I want to model a loop at 136-139. Should I use "LOOP 133 142 136 0 0"?

2) Should I include the option "-nstruct"? If so, how many structures would be sufficient, e.g. 1000?

3) Can I specify the chain identity? I have a heavy chain (HC) and a light chain (LC), and at the moment I have to run them separately.

Thank you very much.

Yours sincerely
Cheng

Tue, 2014-10-14 04:10
lanselibai

Nstruct should be 1 for this sort of work. Here you are only trying to change the loop length, not to find a great loop conformation, so nstruct 1 is fine. Once the loop is the correct length, you'd do your modeling "for real" with larger nstruct, longer cycle counts, etc.

A cutpoint is the center position for CCD loop modeling. Basically, CCD makes the loop kinematically discontinuous, with a break in the middle, and tries to realign the residues on either side of the cutpoint to find a new loop structure. It's irrelevant for the problem at hand: any position in the middle of the loop will work.

The demonstration file uses LOOP 18 28 21 0 0 for the loop 18 to 28. The inserted alanines are 22-25, but the loop itself is larger and is residues 18-28. For internal loops you can't just add residues without flexibility on each side to accommodate them.

Loop files are in "Rosetta numbering" - all residues numbered from 1, in the order they appear in the PDB. There *may* be a way to include chain data in the loop file with a character before/after the number (like, LOOP 18a 28a 21a 0a 0a, or maybe as a18 a28 etc etc), but I don't think it's ever been fixed. There's a flag (I think -renumber) that can be used with the simple applications that outputs the PDB renumbered from one (try score.cc or score_jd2.cc).
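
As a rough illustration of the numbering described above, here is a minimal Python sketch that maps PDB chain/residue identifiers to 1-based positions in file order and formats a loop file line. The function names `rosetta_numbering` and `loop_line` are made up for this example and are not part of Rosetta. The sketch assumes that only ATOM records count as residues, that nothing in the file is skipped on input, and that the cutpoint must lie inside the loop.

```python
def rosetta_numbering(pdb_lines):
    """Map (chain, PDB residue number, insertion code) -> 1-based position.

    Assumption: residues are counted in the order they first appear in
    ATOM records; HETATM records (waters, ligands) are ignored here.
    """
    mapping = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Fixed PDB columns: chain ID (22), residue number (23-26), iCode (27)
        key = (line[21], int(line[22:26]), line[26:27].strip())
        if key not in mapping:
            mapping[key] = len(mapping) + 1
    return mapping


def loop_line(start, end, cut=None):
    """Format one loop file line: LOOP start end cutpoint skip_rate extend."""
    if cut is None:
        cut = (start + end) // 2  # any position in the middle of the loop
    if not start <= cut <= end:
        raise ValueError("cutpoint should lie inside the loop")
    return f"LOOP {start} {end} {cut} 0 0"


if __name__ == "__main__":
    # The loop from the demo discussed above
    print(loop_line(18, 28, 21))  # prints "LOOP 18 28 21 0 0"
```

With a two-chain file, the second chain simply continues counting where the first one ends, which is why the loop file itself carries no chain information.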

Tue, 2014-10-14 08:59
smlewis

Hi smlewis,
Thank you very much.

When you say "you'd do your modeling "for real" with larger nstruct, longer cycle counts", do you mean I should use "relax.linuxgccrelease" for the post-looping model?

Also many thanks for your explanation about the cutpoint, loop covering region and chain identity.

Yours sincerely
Cheng

Tue, 2014-10-14 14:50
lanselibai

Hi friends,
After running:

~/Cheng/rosetta_2014.30.57114_bundle/main/source/bin/loopmodel.linuxgccrelease @/mnt/hgfs/Mutagenesis_Rosetta/4KMT/4.0_loop_modeling/loop.options -database ~/Cheng/rosetta_2014.30.57114_bundle/main/database &> /mnt/hgfs/Mutagenesis_Rosetta/4KMT/4.0_loop_modeling/loop.log &

The terminal reported:

[1]+ Segmentation fault (core dumped)

The log file is attached.

I am not sure whether the problem is due to the loop file or the options file. Based on the secondary structure of 4KMT, the loops cover 131-140 and 217-228, though the unresolved regions are 136-139 and 223-228.

Could you please help me? Thank you very much.

Yours sincerely
Cheng

The following is my loop file:
LOOP 131 140 136 0 0
LOOP 217 228 223 0 0

The following is my options file:
-database /home/Cheng/rosetta_2014.30.57114_bundle/main/database
-in::file::s /mnt/hgfs/Mutagenesis_Rosetta/4KMT/4.0_loop_modeling/4KMT_HC_native_442residue_FullAtom.pdb
-ex1
-ex2
-ndruns 10
-in::file::fullatom
-out::file::fullatom
-loops::loop_file /mnt/hgfs/Mutagenesis_Rosetta/4KMT/4.0_loop_modeling/4KMT_HC.loops
-loops::build_initial
-loops::remodel perturb_kic
-loops::intermedrelax no
-loops::refine refine_kic
-loops::relax no
-loops::frag_sizes 9 3
-loops::frag_files /mnt/hgfs/Mutagenesis_Rosetta/4KMT/4.0_loop_modeling/aa4KMT_HC09_05.200_v1_3 /mnt/hgfs/Mutagenesis_Rosetta/4KMT/4.0_loop_modeling/aa4KMT_HC03_05.200_v1_3

Tue, 2014-10-14 15:49
lanselibai

Is the loop at the end of the protein? You can also check out RosettaRemodel for building regions into your protein. I believe it should work for terminal residues.

https://www.rosettacommons.org/docs/latest/rosettaremodel.html

https://www.rosettacommons.org/docs/latest/Remodel.html

Wed, 2014-10-15 08:52
jadolfbr

Hi jadolfbr,
Thank you. Yes, the loop file contains two loops, one of which is a tail at the end.

It works fine if only the internal loop is included. So the "Segmentation fault" should be due to the tail loop.

It seems the "Remodel" contains lots of information. I will take a look at it in detail.

Yours sincerely
Cheng

Wed, 2014-10-15 15:06
lanselibai

Hi jadolfbr,
I just tried remodel.linuxgccrelease and failed to model the tail. It worked fine if the loop is an internal loop.

The error I got for the tail remodeling is:

terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr

# My command line is:
~/Cheng/rosetta_2014.30.57114_bundle/main/source/bin/remodel.linuxgccrelease -database ~/Cheng/rosetta_2014.30.57114_bundle/main/database -s /mnt/hgfs/Cheng/remodel/4KMT_HC_native_442residue_FullAtom_looped.pdb -remodel:blueprint /mnt/hgfs/Cheng/remodel/blueprint.4KMT_223_228.domaininsertion -remodel:quick_and_dirty -run:chain H -remodel:num_trajectory 3 -overwrite > remodel.log &

# My blueprint (attached) is
217 V L PIKAA V
218 E L PIKAA E
219 P L PIKAA P
220 K L PIKAA K
221 S L PIKAA S
222 C L PIKAA C
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H

# The log file is attached as well.

Probably "remodel.linuxgccrelease" is not suitable for tail modeling. It seems that I have to use the inefficient FloppyTail.

Yours sincerely
Cheng

Thu, 2014-10-16 07:22
lanselibai

Your assignment is roughly correct. There are two issues here:
1) I am not sure whether you listed only a small section of the blueprint or whether that is everything in it. Normally the initial blueprint (generated with the script accompanying Rosetta or your own) matches the length of your input PDB, and you then edit it to increase or shrink the length. I think you supplied only this small chunk as the blueprint, and that confused Remodel, because your blueprint can be interpreted as: throw away all the residues except 217-222 and rebuild the entire region (missing lines in a blueprint correspond to deletions). So the solution is simply to put the rest of the residues back in the blueprint.

2) If you really want to throw away everything and build a de novo structure 12 residues long, you are pretty close. In that case, unfortunately, it still has to retain some stub from the input structure to start a model (it can't start from vacuum). If that's the goal, you have to make it look something like this:
217 V L PIKAA V
218 E . PIKAA E
219 P L PIKAA P
220 K L PIKAA K
221 S L PIKAA S
222 C L PIKAA C
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H

In this case 218 is the stub that starts the model. It is extended towards the N-terminus by one residue and towards the C-terminus by 10; everything except 218 is built from scratch. Normally this is not how you'd start a de novo structure, but you potentially can...

Fri, 2014-10-17 11:34
possu

Hi Possu,
Thank you very much. It worked as you said once I included the rest of the residues. However, the structure has been perturbed significantly. Can I ask:

1) Is there a flag to minimise the perturbation?

2) What is the recommended value for "-remodel:num_trajectory"?

Thank you also for your point 2, but that is not my case.

Yours sincerely
Cheng

Mon, 2014-10-20 01:08
lanselibai

You must have also rebuilt the rest of the structure. You used -remodel:quick_and_dirty, so it shouldn't actually touch anything outside the assigned 12 residues.
Are you sure the other positions have "." in the third column? You also don't need the PIKAA assignments for positions assigned ".": the dot leaves the backbone alone. You can change the amino acid with PIKAA, but if it is to stay the same, you don't need an assignment.

If that doesn't solve the problem, it might be that you have different chains.
The blueprint does not specify chains and assumes that the target is a single chain. You can still run it over multiple chains (but the blueprint has to contain both chains, if I remember correctly), but the geometry between the chains is treated as a fake bond, so it stays stationary. As such, you can only grow the tail on the last chain. If you want to grow the tail of the first chain, you first have to swap the coordinates to make it the last chain. There's an undocumented/untested way to handle it, but since it's not yet tested, you probably wouldn't want to use it at the moment.

This is one major limitation of the simple blueprint. We had planned ways to handle it differently, but it is not a priority.

For (2): num_trajectory really depends on the problem you are dealing with. Your sample blueprint builds a His-tag, and I don't think Remodel can necessarily give you a good solution: because it is an unconstrained problem, there's no way to differentiate one good free His-tag from another. It can build a model for you, so if you are trying to judge the size of a linker to use, it might be useful. But I don't think you can "predict" a His-tag conformation using fragment insertion alone. So in those cases I wouldn't waste time sampling too much. But if you have a well-defined problem with specific interactions to capture, I would use at least 1000-2000 trajectories to get a sense of what the peptide chain wants to do (and remember that it is coupled to the -save_top flag).

Mon, 2014-10-20 17:06
possu

Hi possu,
Thanks a lot for your detailed help. I am sorry but I did not realise that I need to use "." for all the existing residues.

As you said, I have tried:

1 E .
2 V .
3 Q .
4 L .
5 V .
(...omit...)
219 P .
220 K .
221 S .
222 C .
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H

and

1 E . NATRO
2 V . NATRO
3 Q . NATRO
4 L . NATRO
(...omit...)
219 P . NATRO
220 K . NATRO
221 S . NATRO
222 C . NATRO
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H
0 x L PIKAA H

However, the program reported:

ERROR: seqpos >= 1

The log file is attached.

For the chain identity, I only want to model the tail of one chain. I have HC (heavy chain) and LC (light chain), so I use a text editor to split the PDB into HC_PDB and LC_pdb, and concatenate them back into one PDB file after remodel (or any other protocol that cannot differentiate chain identity).
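
The split-and-concatenate step could also be scripted instead of done in a text editor. The following is only a sketch under simple assumptions: it keeps ATOM and HETATM records, drops every other record type, and the helper names `split_by_chain` and `join_chains` are invented for this example. Passing a different chain order to `join_chains` is one way to put the chain of interest last in the file.

```python
def split_by_chain(pdb_lines):
    """Group ATOM/HETATM records by chain ID (PDB column 22).

    Chains are kept in order of first appearance; all other record
    types (REMARK, TER, CONECT, ...) are dropped in this sketch.
    """
    chains = {}
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            chains.setdefault(line[21], []).append(line.rstrip("\n"))
    return chains


def join_chains(chains, order):
    """Concatenate chains in the requested order, with TER after each."""
    out = []
    for chain_id in order:
        out.extend(chains[chain_id])
        out.append("TER")
    out.append("END")
    return out


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python reorder.py input.pdb L H
    path, *order = sys.argv[1:]
    with open(path) as handle:
        chains = split_by_chain(handle)
    print("\n".join(join_chains(chains, order or list(chains))))
```

Atom serial numbers are left untouched here, so they will no longer be sequential after reordering.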

Also thank you for the trajectory suggestion. Yes, I think I do not need to spend too much time on this His tag as I will always replace them in the subsequent homology modeling.

Yours sincerely
Cheng

Tue, 2014-10-21 15:38
lanselibai

What you are doing in the first file is correct. The problem here is that the extension "0 x L" needs to know where to connect to the starting structure, so you actually need to change "222 C ." to "222 C L NATRO" or "222 C L PIKAA C".

Then it should work just fine. Again, it's because blueprints can handle deletions, so you have to specify the flanking regions.
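
To make the layout concrete, here is a small Python sketch that writes a blueprint of this shape from a sequence string: "." for every existing residue, "L PIKAA" on the last existing residue so the extension has something to connect to, and "0 x L PIKAA" lines for the new tail. The function name `tail_blueprint` is hypothetical, and the line format is copied from the blueprints shown in this thread rather than from a formal specification.

```python
def tail_blueprint(sequence, tail, first_resnum=1):
    """Build blueprint lines that append `tail` to the C-terminus.

    sequence: one-letter codes of the existing residues, in order
    tail: one-letter codes of the residues to add
    first_resnum: number of the first existing residue (assumed 1)
    """
    if not sequence:
        raise ValueError("need at least one existing residue to attach to")
    lines = []
    last_index = len(sequence) - 1
    for i, aa in enumerate(sequence):
        resnum = first_resnum + i
        if i == last_index:
            # Flanking residue: flexible backbone, identity kept with PIKAA
            lines.append(f"{resnum} {aa} L PIKAA {aa}")
        else:
            # "." leaves this position alone
            lines.append(f"{resnum} {aa} .")
    for aa in tail:
        # "0 x" marks a residue that does not exist in the input structure
        lines.append(f"0 x L PIKAA {aa}")
    return lines


if __name__ == "__main__":
    # Toy example: three existing residues plus a two-residue tail
    print("\n".join(tail_blueprint("EVQ", "HH")))
```

Every existing residue appears in the output, since a missing line would be read as a deletion.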

Fri, 2014-10-24 16:09
possu

Hi possu,
Thank you very much. As you said, it finally works: 1) there is no perturbation of the rest of the structure, and 2) a tail is added to the structure.

Perhaps an example could be included in the demo files and on the website.

Yours sincerely
Cheng

Sat, 2014-10-25 14:20
lanselibai

Hi possu,
I have just realised that even when "NATRO" or "NATAA" is assigned to the flanking residues, those residues are likely to be changed to other residues. So I have to use "PIKAA" to prevent them from becoming other amino acids. I am fine with using "PIKAA"; I am just curious why the residues change when "NATRO" or "NATAA" is used.

As I understand it, both "NATRO" and "NATAA" should "use the native amino acid residue". Does "native" here mean the original amino acid in the PDB structure, or any of the 20 amino acids?

Thank you.

Yours sincerely
Cheng

Fri, 2014-11-07 02:06
lanselibai

Dear friends,
When I tried "loop_length_changing" and compared the output PDB to the starting PDB, the coordinates were displaced (shifted). Is there some way to prevent this coordinate shift? I am not sure whether the loop region is the only region that has been modeled.

Also, it seems that "loop_length_changing" cannot be used to model a tail (region at the end of the chain). Is this correct?
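
One rough way to check this independently of any overall shift is to compare CA-CA distances among the residues outside the loop: a rigid translation or rotation of the whole structure leaves them unchanged, while real remodeling of those residues does not. The Python sketch below is only an illustration. The function names are made up, it assumes one CA per residue (no alternate locations), and the caller has to supply the matching residue indices in both files, because positions after the loop shift by the number of inserted residues.

```python
import math


def ca_coords(pdb_lines):
    """Return CA coordinates in file order, one per residue (assumed)."""
    coords = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            # Fixed PDB columns for x, y, z
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def max_internal_distance_change(ref, new, indices_ref, indices_new):
    """Largest change in CA-CA distance over all pairs of matched residues.

    indices_ref[k] and indices_new[k] are 0-based positions of the same
    residue in the reference and the new structure.
    """
    pairs = list(zip(indices_ref, indices_new))
    worst = 0.0
    for a in range(len(pairs)):
        for b in range(a + 1, len(pairs)):
            (ra, na), (rb, nb) = pairs[a], pairs[b]
            d_ref = math.dist(ref[ra], ref[rb])
            d_new = math.dist(new[na], new[nb])
            worst = max(worst, abs(d_ref - d_new))
    return worst
```

A value close to zero for the residues outside the loop would suggest that they moved only as a rigid body, if at all.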

Thank you very much.

Yours sincerely
Cheng

Mon, 2014-10-20 01:40
lanselibai