Adding noncanonical amino acid (PCA) to Rosetta 3

Hi everyone, I'm trying to incorporate a noncanonical amino acid into Rosetta, namely pyroglutamic acid (also known as 5-oxo-L-proline):

http://www.rcsb.org/pdb/ligand/ligandsummary.do?hetId=PCA

Following the directions in the supporting information of Doug Renfrew's paper "Using Noncanonical Amino Acids in Computational Protein-Peptide Interface Design," I created a modified .mol file and got as far as the step that runs the molfile_to_params_polymer script.

However, upon running the script:

python scripts/molfile_to_params_polymer.py --clobber --polymer --no-pdb --name PCA PCA.mol

the following error appears:

ALL TO ALL DIST CA INDEX
[1e+100, 1e+100, 1e+100, 1e+100, 1e+100, 1e+100, 1e+100, 1, 1e+100, 1e+100, 0, 1e+100, 1e+100, 1e+100, 1e+100, 1e+100, 1e+100, 1e+100, 1e+100, 1e+100]
Traceback (most recent call last):
File "scripts/molfile_to_params_polymer.py", line 1995, in <module>
sys.exit(main(sys.argv[1:]))
File "scripts/molfile_to_params_polymer.py", line 1953, in main
polymer_assign_pdb_like_atom_names_to_sidechain( m.atoms, m.bonds, options.peptoid )
File "scripts/molfile_to_params_polymer.py", line 1697, in polymer_assign_pdb_like_atom_names_to_sidechain
a.pdb_greek_dist = greek_alphabet[all_all_dist[ca_index][i]]
TypeError: list indices must be integers, not float

Do you have any idea how I might fix this? Below is the PCA.mol file I tried to convert.

Thanks and sincerely,

Julius

PCA.mol

PCA.xyz
-ISIS- 3D

20 20 0 0 0 0 0 0 0 0 0
2.1208 0.3034 0.0378 C 0 0 0 0 0
3.0440 1.0326 0.2298 O 0 0 0 0 0
1.1554 0.4742 -0.9125 N 0 0 0 0 0
1.0565 1.3779 -1.3192 H 0 0 0 0 0
1.8000 -0.9775 0.7849 C 0 0 0 0 0
1.4814 -0.7003 1.7830 H 0 0 0 0 0
2.6823 -1.5982 0.8701 H 0 0 0 0 0
0.6668 -1.6174 -0.0238 C 0 0 0 0 0
-0.0570 -2.1382 0.5903 H 0 0 0 0 0
1.0627 -2.3243 -0.7440 H 0 0 0 0 0
0.0410 -0.4240 -0.7906 C 0 0 0 0 0
-0.3108 -0.7313 -1.7700 H 0 0 0 0 0
-1.1136 0.1819 0.0098 C 0 0 0 0 0
-0.9173 0.9751 0.8881 O 0 0 0 0 0
-2.3365 -0.2906 -0.3133 N 0 0 0 0 0
-2.4266 -0.8715 -1.1146 H 0 0 0 0 0
-3.5473 0.1611 0.3396 C 0 0 0 0 0
-3.2867 0.5559 1.3088 H 0 0 0 0 0
-4.2305 -0.6697 0.4625 H 0 0 0 0 0
-4.0419 0.9400 -0.2314 H 0 0 0 0 0
3 1 1 0 0 0
1 2 2 0 0 0
1 5 1 0 0 0
4 3 1 0 0 0
3 11 1 0 0 0
8 5 1 0 0 0
5 7 1 0 0 0
5 6 1 0 0 0
11 8 1 0 0 0
10 8 1 0 0 0
8 9 1 0 0 0
12 11 1 0 0 0
11 13 1 0 0 0
15 13 1 0 0 0
13 14 2 0 0 0
16 15 1 0 0 0
15 17 1 0 0 0
20 17 1 0 0 0
17 19 1 0 0 0
17 18 1 0 0 0
M ROOT 3
M POLY_N_BB 3
M POLY_CA_BB 11
M POLY_C_BB 13
M POLY_O_BB 14
M POLY_IGNORE 16 17 18 19 20
M POLY_UPPER 15
M POLY_LOWER 5
M POLY_CHG 0
M POLY_PROPERTIES PROTEIN
M END
$$$$

Fri, 2014-08-01 11:23
JuliusSu

Like PDB, the MOL file format is column-based, so spacing matters. Unfortunately the forum strips out the spacing, so pasting a mol file in-line doesn't work too well. You'll want to attach the mol file with the "Add a new file" option. You'll need to rename the file to something ending in .txt first, though.
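To illustrate what "column-based" means here, this is a minimal Python sketch that rebuilds fixed-width atom and bond lines from the whitespace-collapsed lines pasted above. It assumes the standard V2000 layout (three 10-character coordinate columns, a 3-character element symbol, then narrow integer columns). The function names are made up for the example and are not part of Rosetta.

```python
def format_atom_line(stripped_line):
    """Rebuild a fixed-width V2000 atom line from whitespace-separated fields."""
    fields = stripped_line.split()
    x, y, z = (float(value) for value in fields[:3])
    symbol = fields[3]
    extras = [int(value) for value in fields[4:]]
    # Coordinates: three 10-character columns with 4 decimals, then one space
    # and a 3-character, left-justified element symbol.
    line = f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3}"
    # The first trailing field (mass difference) is 2 characters wide,
    # the remaining ones are 3 characters wide.
    for position, value in enumerate(extras):
        width = 2 if position == 0 else 3
        line += f"{value:{width}d}"
    return line


def format_bond_line(stripped_line):
    """Rebuild a fixed-width V2000 bond line: every field is 3 characters wide."""
    return "".join(f"{int(value):3d}" for value in stripped_line.split())


atom = format_atom_line("2.1208 0.3034 0.0378 C 0 0 0 0 0")
bond = format_bond_line("3 1 1 0 0 0")
print(repr(atom))
print(repr(bond))
```

This only covers the atom and bond blocks. The header lines, the counts line, and the `M` property lines would still need to be checked by hand or regenerated by a molecular editor.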

From what I can tell, though, there is some issue with the bonding for your molecule. The script is failing at the step where it works out the through-bond separation from each atom to every other atom (i.e. whether two atoms are separated by one bond, two bonds, etc.). Most of the values printed are still the default (1e+100), meaning that the bond separation detection failed. That float then gets used as a list index, which is what raises the TypeError.
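As a rough picture of that failure mode, here is a small Python sketch of a through-bond distance calculation using a breadth-first search. It is not the code from molfile_to_params_polymer.py. The sentinel value, the toy bond list, and the `greek` lookup list are assumptions chosen to mirror the printed row and the traceback.

```python
from collections import deque

UNREACHED = 1e100  # same sentinel value that shows up in the printed row


def through_bond_distances(n_atoms, bonds, start):
    """Return the number of bonds from `start` to every atom (0-based indices)."""
    adjacency = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    distances = [UNREACHED] * n_atoms
    distances[start] = 0
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if distances[neighbor] == UNREACHED:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    return distances


# Toy molecule: atoms 0-1-2 are bonded in a chain, atom 3 is not connected.
row = through_bond_distances(4, [(0, 1), (1, 2)], start=0)

# Hypothetical lookup table indexed by bond separation.
greek = ["A", "B", "G", "D"]
index_failed = False
try:
    greek[row[3]]  # row[3] is still the float sentinel
except TypeError:
    index_failed = True
print(row, index_failed)
```

Any atom the search never reaches keeps the float sentinel, and indexing a list with it raises the same kind of TypeError as in the traceback.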

My initial thought is some formatting issue with the input file. Rosetta is more sensitive to formatting than dedicated molecular editors are. You may want to try passing the mol file through something like OpenBabel to normalize the formatting/representation (i.e. "convert" from mol format to mol format). If you attach the file I could tell better, though.
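One way to script that mol-to-mol round trip is sketched below in Python. It assumes the `obabel` command-line tool is installed and on your PATH. The output file name is just a placeholder, and the helper names are made up for the example.

```python
import os
import shutil
import subprocess


def obabel_roundtrip_command(in_path, out_path):
    """Build the obabel call; formats are inferred from the file extensions."""
    return ["obabel", in_path, "-O", out_path]


def normalize_mol(in_path, out_path):
    """Run the round trip; return False if obabel or the input file is missing."""
    if shutil.which("obabel") is None or not os.path.exists(in_path):
        return False
    result = subprocess.run(obabel_roundtrip_command(in_path, out_path))
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder file names for illustration.
    print(normalize_mol("PCA.mol", "PCA_normalized.mol"))
```

Note that a round trip like this will likely drop or reorder the Rosetta-specific `M` lines (ROOT, POLY_N_BB and so on), so check the output and re-add them before running the params script.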

Mon, 2014-08-04 14:11
rmoretti

Thanks rmoretti, and apologies for the late follow-up.

Doug Renfrew helped me debug these particular error messages. It turns out that the script assumes a polymeric residue, for which there is a sensible POLY_LOWER and POLY_UPPER. But PCA is a proline-like residue, used here as an N-terminal residue, so it has no lower connection. We can therefore fix the error by changing the POLY_LOWER atom to 4 (the N-terminal hydrogen) and re-running the script. In the resulting params file, we then delete the LOWER_CONNECT line and add an ATOM line and a BOND line for the H.
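For anyone repeating this, the two mechanical edits can be sketched in Python as below. The helper names and the toy input strings are made up for illustration. The ATOM and BOND lines for the H are not generated here, since their exact fields depend on the atom types and charges in your own params file.

```python
def set_poly_lower(mol_text, atom_number):
    """Point the `M POLY_LOWER` line of a mol file at a different atom number."""
    edited = []
    for line in mol_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[:2] == ["M", "POLY_LOWER"]:
            stripped = line.rstrip()
            # Keep the original prefix and spacing, swap only the last field.
            line = stripped[: len(stripped) - len(fields[2])] + str(atom_number)
        edited.append(line)
    return "\n".join(edited)


def drop_lower_connect(params_text):
    """Remove any LOWER_CONNECT line from the text of a params file."""
    kept = [
        line
        for line in params_text.splitlines()
        if not line.startswith("LOWER_CONNECT")
    ]
    return "\n".join(kept)


# Toy fragments, not the full files.
mol_fragment = "M ROOT 3\nM POLY_LOWER 5\nM END"
params_fragment = "NAME PCA\nLOWER_CONNECT N\nUPPER_CONNECT C"
print(set_poly_lower(mol_fragment, 4))
print(drop_lower_connect(params_fragment))
```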

Fri, 2014-08-08 02:07
JuliusSu