Error with recognizing rotamer library atoms


Hi,

I am trying to run enzyme design on a protein with a cofactor which is covalently bound to a residue in the protein, trying to model in a different ligand for the protein. The cofactor is important in ligand binding. I have previously tried using a constraint file on the cofactor, which worked but did not give me acceptable results, so now I am trying to create a non-canonical amino acid. I created a .pdb of the non-canonical amino acid (let's abbreviate it NAA), generated a .mol2 file using openbabel, and generated a .pdb file of the rotamers using FROG2. The rotamers were separated by END lines, which I changed to TER. I reference this .pdb file in the .params file I generate for NAA with this line at the end of the .params file: "PDB_ROTAMERS path/to/file.pdb". When I try to run enzyme design, this is what I see:


core.init: Mini-Rosetta version exported from unknown
core.init: command: /soft/rosetta/rosetta_source/bin/enzyme_design.linuxgccrelease -database /soft/rosetta/rosetta_database -in:file:s input_onur.pdb -in:file:extra_res_fa LG1.params NAA.params -enzdes:detect_design_interface -enzdes:cst_design -enzdes:cut1 6 -enzdes:cut2 8 -enzdes:cut3 10 -enzdes:cut4 12 -cst_min -chi_min -bb_min -ex1 -ex2 -use_input_sc -design_min_cycles 3 -packing:linmem_ig 10 -resfile resfile.res -nstruct 1 -overwrite -out:file:o score.sc
core.init: 'RNG device' seed mode, using '/dev/urandom', seed=-908937193 seed_offset=0 real_seed=-908937193
core.init.random: RandomGenerator:init: Normal mode, seed=-908937193 RG_type=mt19937
core.scoring.etable: Starting energy table calculation
core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)
core.scoring.etable: Finished calculating energy tables.
core.io.database: Database file opened: /soft/rosetta/rosetta_database/pdb_pair_stats_fine
core.io.database: Database file opened: /soft/rosetta/rosetta_database/scoring/score_functions/hbonds/standard_params/HBPoly1D.csv
core.io.database: Database file opened: /soft/rosetta/rosetta_database/scoring/score_functions/hbonds/standard_params/HBFadeIntervals.csv
core.io.database: Database file opened: /soft/rosetta/rosetta_database/scoring/score_functions/hbonds/standard_params/HBEval.csv
core.io.database: Database file opened: /soft/rosetta/rosetta_database/P_AA
core.io.database: Database file opened: /soft/rosetta/rosetta_database/P_AA_n
core.io.database: Database file opened: /soft/rosetta/rosetta_database/P_AA_pp
core.io.database: Database file opened: /soft/rosetta/rosetta_database/Rama_smooth_dyn.dat_ss_6.4
core.scoring.etable: Using alternate parameters: LJ_RADIUS_SOFT in Etable construction.
core.scoring.etable: Starting energy table calculation
core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)
core.scoring.etable: Finished calculating energy tables.
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' N ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' CA ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' C ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' O ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' CB ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' CG ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' CD ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' CE ' in library for NAA
core.scoring.dunbrack.SingleLigandRotamerLibrary: Skipping unrecognized atom ' NZ ' in library for NAA

ERROR: Missing atom positions in rotamer library!
ERROR:: Exit from: src/core/scoring/dunbrack/SingleLigandRotamerLibrary.cc line: 151

These are the atoms that belong to the amino acid (lysine) in standard .pdb format... How can I make Rosetta recognize them?

Thanks in advance!

Fri, 2011-11-04 15:23
oerbilgin

I can diagnose this more completely if you post a section of your PDB_ROTAMERS file, but I would guess the atom names are subtly wrong. It's likely to be a spacing thing, like it is "..N." instead of ".N.." (where I have substituted . for an empty space). The spacing looks ok in those error messages, but that's what usually causes it, so I feel I should suggest it.

It may be worth trying a different mol2->pdb format converter. (Also try END instead of TER, unless the instructions said otherwise.) I assume you've seen the set of instructions Doug wrote? I think there's a copy (design_ncaa) in the tutorials collection here: http://www.rosettacommons.org/content/rosetta3-tutorials-beta-release

Fri, 2011-11-04 17:05
smlewis

TER versus END shouldn't make much difference - the way the reader is written, almost anything besides an ATOM or HETATM record will function as a separator.

I agree with Steven that it's the atom naming that is the most likely culprit. He makes a good point that spacing is a likely issue. The other thing to look out for is to check that the names of the atoms in the PDB_ROTAMERS file match those names present in the params file - which might not match those of your starting structure if you used a method (e.g. molfile_to_params.py) which renamed them. (Note that the atom name in the params file follows the ATOM tag after a single space - so in the params file it's "ATOM.XXXX", where the . is a space and the four X's are the exact string used as the atom name, including any spaces.)
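The name-matching check described above can be automated. The following is a minimal sketch, not part of Rosetta: it assumes the params file stores each atom name as the four characters right after "ATOM" plus one space (as described above) and that the PDB_ROTAMERS file uses standard PDB columns 13-16 for the atom name. The function names and the sample records are hypothetical.

```python
def params_atom_names(params_text):
    """Return the 4-character atom names from the ATOM lines of a params file."""
    names = []
    for line in params_text.splitlines():
        if line.startswith("ATOM "):
            # "ATOM" + one space, then exactly four characters of atom name.
            names.append(line[5:9].ljust(4))
    return names


def pdb_atom_names(pdb_text):
    """Return the set of 4-character atom names (PDB columns 13-16)."""
    return {
        line[12:16].ljust(4)
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM  ", "HETATM"))
    }


def compare_names(params_text, pdb_text):
    """Return (names only in params, names only in PDB), spaces included."""
    in_params = set(params_atom_names(params_text))
    in_pdb = pdb_atom_names(pdb_text)
    return sorted(in_params - in_pdb), sorted(in_pdb - in_params)


def pdb_line(serial, name4, resname):
    """Build a truncated, made-up ATOM record with the name in columns 13-16."""
    return "ATOM  %5d %4s %3s A   1" % (serial, name4, resname)


if __name__ == "__main__":
    # Hypothetical params excerpt: names are " N  " and " CA ".
    params = "ATOM  N   Nbb  NH1  -0.47\nATOM  CA  CAbb CT1   0.07\n"
    # Hypothetical rotamer records: the first name is shifted ("N   ").
    pdb = pdb_line(1, "N   ", "NAA") + "\n" + pdb_line(2, " CA ", "NAA") + "\n"
    only_params, only_pdb = compare_names(params, pdb)
    print("only in params:", [repr(n) for n in only_params])
    print("only in PDB   :", [repr(n) for n in only_pdb])
```

Printing the names with repr() makes a pure spacing mismatch visible, which is easy to miss when reading the files by eye.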

Sun, 2011-11-06 12:48
rmoretti

Hi, thank you for your responses!

The problem was indeed the naming as well as the spacing. When I generated the rotamer library using FROG2, it changed the atom labeling to C1, C2, N1, N2 etc, from CA, C, N, NZ etc which was one of the problems before. The only way Rosetta would work was if I changed all of the labeling (in the input PDB, rotamers, etc) to C1, C2, N1, N2 etc. Rosetta gave me the same error as before if I changed all the labeling in the rotamer library to CA, C, N, NZ etc. Anyways, when I finally got it to run, I checked the result in pymol, and the non-canonical residue was missing from the structure. I went into the output .pdb file, and it was absent. I had it substituted as residue 256 in the input .pdb, and the Rosetta output file had residue 255, then 257. None of the scores in the output .pdb had my residue either. When I open the input .pdb file in pymol, pymol changes all the coordinates of my non-canonical AA to 400-something (as opposed to 30-something), resulting in what looks like an atom explosion in the viewer. I've been messing around with atom labeling and numbering in all the files, but to no avail.

The only warning I get in a successful run of Rosetta is this:

core.conformation.Conformation: [ WARNING ] missing heavyatom: OXT on residue SER_p:CtermProteinFull 258

But that serine is nowhere near my non-canonical amino acid- it is the last amino acid in the sequence.

Also, a tutorial for this does not exist in that link Steven sent me- it has a blank readme. I have been using the information in the tarball here: http://www.rosettacommons.org/content/some-application-troubles-when-using-resfile-flag#comment-2626.

Thanks

Tue, 2011-11-08 18:27
oerbilgin

"Also, a tutorial for this does not exist in that link Steven sent me- it has a blank readme. I have been using the information in the tarball here: http://www.rosettacommons.org/content/some-application-troubles-when-usi.... " - it's out of date, try the .bz2 tarball in the same directory. We still (ugh) haven't "officially" released these, so there's a bunch of old copies floating around...

As far as atom names: I strongly recommend forcing at least your backbone atoms to be CA, C, N, O in the params and pdb_rotamers file. Rosetta is supposed to be independent of assumptions, but there are likely to be some sticky spots where backbone connectivity will screw up if the "normal" atoms have abnormal names. For sidechain atoms it shouldn't matter much. Once peptoids come online this will go away, but the peptoids aren't in main Rosetta yet...

If the noncanonical residue is missing entirely from the structure, but there is no error message about it, then the scenario that comes to mind is A) you have -ignore_unrecognized_res on, and B) you do not have the new params file active via -extra_res_fa. If that's the case, turn off the first flag and turn on the second (with the path to your params file). This seems unlikely given your earlier command line, but it's worth asking. If this is not the case, try posting the following things for me to debug with: A) the region of the input PDB with your noncanonical (NC, plus one residue on either side?), B) your params file, C) the first two rotamers worth of your pdb_rotamers file, and D) the full command line.

The missing OXT warning is par for the course with Rosetta, don't worry about it. Rosetta is adding the extra O atom for the unpolymerized C-terminus and telling you about it.

Wed, 2011-11-09 07:16
smlewis

Steven, thanks for your suggestions. I tried forcing the labeling of the backbone atoms to CA, C, N, O, but now Rosetta exits out with this error:

ERROR: stop > start
ERROR:: Exit from: src/protocols/ligand_docking/LigandBaseProtocol.cc line: 854

And with the following block of warnings:

core.io.pdb.file_data: Adding undetected upper terminus type! 237
core.io.pdb.file_data: Adding undetected lower terminus type! 239
core.conformation.Residue: WARNING: Residue connection id changed when creating a new residue at seqpos 239
core.conformation.Residue: WARNING: ResConnID info stored on residue 240 is now out of date!
core.conformation.Conformation: [ WARNING ] missing heavyatom: OXT on residue ALA_p:CtermProteinFull 237
core.conformation.Conformation: [ WARNING ] missing heavyatom: OXT on residue SER_p:CtermProteinFull 259

My non-canonical AA is residue 238 (in the .pdb it is 256, but since the .pdb starts at res 19, rosetta numbers it at 238). It appears that for some reason Rosetta doesn't like connecting the NCAA to the rest of the polypeptide. The coordinates for the NCAA are the exact same as in the .pdb of the solved structure for the protein, so I am confused as to why this is happening.
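The renumbering mentioned above (PDB residue 256 becoming Rosetta residue 238) is plain offset arithmetic. A minimal sketch, assuming a single chain with no gaps or insertion codes between the first residue and the residue of interest; the function name is hypothetical:

```python
def pose_number(pdb_resnum, first_pdb_resnum):
    """Convert a PDB residue number to Rosetta's 1-based sequential numbering.

    Assumes one chain, numbered consecutively with no missing residues.
    """
    return pdb_resnum - first_pdb_resnum + 1


if __name__ == "__main__":
    # The PDB starts at residue 19, and the non-canonical residue is 256.
    print(pose_number(256, 19))
```

If the structure has missing residues, the offset changes after each gap, so the number has to be counted from the actual residue list instead.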

Attached is the information you requested:
A) PDB of NCAA + one residue before and one after
note: C1-4 of KRT are the side chain carbons of the lysine. I have tried running Rosetta with these as CB, CG, CD, CE in all the files (input .pdb, .params, rotamers) and got the same errors I describe above.
B) .params file
C) Input to FROG2, and the first two rotamers
D) the full command line is:

/soft/rosetta/rosetta_source/bin/enzyme_design.linuxgccrelease -database /soft/rosetta/rosetta_database -in:file:s input_onur4.pdb -in:file:extra_res_fa LG1.fa.params KRT_params4.fa.params -enzdes:detect_design_interface -enzdes:cst_design -enzdes:cut1 6 -enzdes:cut2 8 -enzdes:cut3 10 -enzdes:cut4 12 -cst_min -chi_min -bb_min -ex1 -ex2 -use_input_sc -design_min_cycles 3 -packing:linmem_ig 10 -nstruct 1 -overwrite -out:file:o score.sc

Thank you so much for your time and effort =)

Wed, 2011-11-09 18:30
oerbilgin

I'm guessing your problem is that the params file for your NCAA is of "TYPE LIGAND". I haven't read the NCAA tutorials that Steven linked to earlier, but I assume there's some discussion of how to make your NCAA be of "TYPE POLYMER".

What I believe is happening is that the enzyme design code is failing while finding regions for flexible backbone movement. Because the NCAA is of type ligand, Rosetta isn't seeing the section of backbone of which it is a part as a continuous chain. It therefore breaks the backbone, and is dying because it looks like you have one-residue-long "polypeptides" (not the NCAA, but the residues on either side).

Things may work if you turn off backbone flexibility, or if you add additional residues to your input PDB, but your better bet is to figure out how to convert your params file to type polymer.

Thu, 2011-11-10 08:46
rmoretti

The major problem is that you have a small-molecule ligand, not a polymer ligand. In other words, the params file is not set up for your noncanonical residue to be part of the polypeptide chain, so Rosetta is trying to insert it with no chemical bonds to other atoms. The tarball of instructions from Doug should have included a version of the molfile_to_params.py file that works for _polymer_ residues; this is the version you need to use. Specifically, the polymer residue params file needs to define UPPER_CONNECT and LOWER_CONNECT for the connections to the upstream and downstream N and C atoms.

Secondarily, I think there's a lot of extra data in your rotamers file - Rosetta wants only the ATOM lines with TER cards between them; none of the CONECT lines, etc. Also your first rotamer has no hydrogens.
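The cleanup described above (only ATOM lines, with TER cards between rotamers) can be sketched as follows. This is a hypothetical helper, not a Rosetta tool. It assumes that non-ATOM lines such as CONECT or END appear only between rotamers, never in the middle of one, and it keeps HETATM records as well since the reader treats them as atom records.

```python
def clean_rotamer_library(text):
    """Keep ATOM/HETATM records and put exactly one TER after each rotamer."""
    out = []
    for line in text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            out.append(line)
        elif out and out[-1] != "TER":
            # Any other record (END, CONECT, REMARK, ...) ends the rotamer.
            out.append("TER")
    if out and out[-1] != "TER":
        out.append("TER")
    return "\n".join(out) + "\n" if out else ""


def atoms_per_rotamer(cleaned_text):
    """Count atom records in each TER-terminated rotamer of cleaned output."""
    counts, current = [], 0
    for line in cleaned_text.splitlines():
        if line == "TER":
            counts.append(current)
            current = 0
        else:
            current += 1
    return counts


if __name__ == "__main__":
    # Made-up, truncated records purely for illustration.
    raw = "REMARK demo\nATOM  a\nCONECT 1 2\nEND\nATOM  b\nATOM  c\nEND\n"
    cleaned = clean_rotamer_library(raw)
    print(cleaned, end="")
    print(atoms_per_rotamer(cleaned))
```

Unequal counts from atoms_per_rotamer would point to a rotamer that is missing atoms, such as one written without hydrogens.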

I have attached a params and partial PDB_Rotamers file for a system I've used so you can see what the params and pdb_rotamers file ought to look like. This is from the publication: Gulyani A, Vitriol E, Allen R, Wu J, Gremyachinskiy D, Lewis S, Dewar B, Graves LM, Kay BK, Kuhlman B, Elston T, Hahn KM. A biosensor generated via high-throughput screening quantifies cell edge Src dynamics. Nat Chem Biol. 2011 Jun 12;7(7):437-44. doi: 10.1038/nchembio.585.

Thu, 2011-11-10 08:48
smlewis

Thanks for your help-- that makes a lot of sense.

OK but now my problem is that I can't get the molfile_to_params_polymer.py to run correctly. I am using the material in the tarball that you said was out of date (so maybe that is the problem?)... I couldn't find the .bz2 tarball in that directory, and there is no tutorial for NCAA enzdes in the RosettaCon folder.

This is what I see:

onur@XXXX:~/Data/MAGIC/Rosetta/enzdes/Onur/redo2> /home/oerbilgin/Data/MAGIC/Rosetta/enzdes/for_ash/molfile_to_params_polymer.py -n KRT_params5 --polymer KRT_mol2_5.mol2
Traceback (most recent call last):
  File "/home/oerbilgin/Data/MAGIC/Rosetta/enzdes/for_ash/molfile_to_params_polymer.py", line 1995, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/oerbilgin/Data/MAGIC/Rosetta/enzdes/for_ash/molfile_to_params_polymer.py", line 1913, in main
    molfiles = read_tripos_mol2(infile)
  File "/home/oerbilgin/Data/MAGIC/Rosetta/enzdes/for_ash/python/rosetta_py/io/mdl_molfile.py", line 305, in read_tripos_mol2
    ret = read_tripos_mol2(f, do_find_rings)
  File "/home/oerbilgin/Data/MAGIC/Rosetta/enzdes/for_ash/python/rosetta_py/io/mdl_molfile.py", line 371, in read_tripos_mol2
    assert len(f) >= 4, "Missing fields on line %i" % line_num[0]
AssertionError: Missing fields on line 148

Where on line 148 is:

M ROOT 1

I have attached the .mol2 file I am using; maybe the formatting in there is wrong? It definitely looks different from the example files in the tarball, but when I change the formatting in various ways to make it look similar, I get different errors:

UnboundLocalError: local variable 'atom_indices' referenced before assignment


UnboundLocalError: local variable 'molfile' referenced before assignment


ValueError: invalid literal for int() with base 10: '38.1340'

I've also tried just changing my original .params file to have TYPE POLYMER and LOWER_CONNECT and UPPER_CONNECT values, but not surprisingly, that didn't work either.

Again, thanks for your time and patience with me.

Thu, 2011-11-10 12:01
oerbilgin

As far as I'm aware, the lines starting with M aren't valid for .mol2 format files (they're only present in .mol files).

If you wanted to use the M lines with a .mol2 file under the standard molfile_to_params.py, you could use the flag --m-ctrl, but that's a relatively recent addition and I don't know if your molfile_to_params_polymer.py has that option (run it with just the -h flag to find out).

One workaround is to use your favorite molecular editor/file conversion utility (such as OpenBabel) to convert the .mol2 file (again, without the M lines) to a regular .mol file, and then add your M lines to that file. One drawback is that the conversion would drop all the partial charge information that's present in the .mol2 file, and substitute different charges. If that matters to you, you can always manually edit the params file once generated such that the atoms have whatever charges you want.

Thu, 2011-11-10 19:00
rmoretti

I defer to Rocco on the mol file issue. As far as the tarball - can you see
http://rosettadesign.med.unc.edu/collaborators/RosettaCon2011/RosettaCon... ?

Fri, 2011-11-11 09:10
smlewis

Thanks both of you for responding. The M line problem was indeed because of the mol2 vs mol format. I used OpenBabel to make a new .mol file from my original .pdb, and got the molfile_to_params_polymer.py script (from the new tarball that Steven sent me, thanks) to run, but then ran into trouble again. After some troubleshooting, I think I have it narrowed down to a couple of lines in the .params file.

I initially got a segmentation fault after core.scoring.etable finished calculating energy tables. When I went into the .params file, comparing it to the example in Doug's tutorial showed me that the UPPER_CONNECT and LOWER_CONNECT lines looked goofy:

LOWER_CONNECT1N
UPPER_CONNECT1C

To comply with the other AA .params files, and Doug's example, I changed these to:

LOWER_CONNECT N
UPPER_CONNECT C

Which allowed me to get through the segmentation fault, but then resulted in this error:

core.scoring.etable: Finished calculating energy tables.

ERROR: set_atom_base: atoms dont exist!
ERROR:: Exit from: src/core/chemical/ResidueType.cc line: 553

At this point I am not even trying to call in the rotamer library. I've attached my .params file.

Thanks a lot for all your help!

Mon, 2011-11-14 18:25
oerbilgin

A) What version of Rosetta are you using again? The error "set_atom_base: atoms dont exist" is on line 754, not line 553.

B) I assume it's in the ICOOR_INTERNAL lines at the end of the file. Those lines are occasionally frame-shifted, which *might* be the problem, but it looks ok relative to my own params file, so maybe not. The error means that it doesn't recognize the atoms on the right three columns when creating that row (I think it's the first of the right three columns). It could be that they are out of order somehow, so that an atom is declared to depend on another atom that doesn't exist yet? (Of course, the atoms should exist because they're in the atom list in the top of the file, so maybe that isn't possible). The backbone atoms must be special-cased or something so they can depend on themselves, but maybe the other atoms are out of order somehow? Does that make sense?

C) It could be that there are too many CHI lines - you'll note the 10 and 11 lines change the spacing; try commenting out 10 and 11 and see what happens.

Tue, 2011-11-15 09:10
smlewis

Did you realize your params file has two atoms named "H", one of type HNbb and one of type Hpol? Also, in your ICOOR_INTERNAL lines, for one of the H's you have the C atom listed twice, once for the angle atom and once for the dihedral atom.

I'm not sure if it's the cause of your current problem, but even if not, it will likely cause grief later on.
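Duplicate atom names like the two "H" atoms mentioned above are easy to detect mechanically. A minimal sketch, assuming the params file keeps the atom name in the four characters after "ATOM" and one space; the function name and the sample lines are hypothetical:

```python
from collections import Counter


def duplicate_atom_names(params_text):
    """Return the atom names that appear on more than one ATOM line."""
    names = [
        line[5:9].ljust(4)
        for line in params_text.splitlines()
        if line.startswith("ATOM ")
    ]
    return sorted(name for name, count in Counter(names).items() if count > 1)


if __name__ == "__main__":
    # Made-up excerpt with two atoms both named " H  ".
    params = (
        "ATOM  N   Nbb  NH1  -0.47\n"
        "ATOM  H   HNbb H     0.31\n"
        "ATOM  H   Hpol H     0.33\n"
    )
    print([repr(n) for n in duplicate_atom_names(params)])
```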

Tue, 2011-11-15 11:06
rmoretti

Hi, this is Doug. Steven asked if I would take a look at this thread to see if I had any ideas. The molfile_to_params_polymer.py script can be finicky. It requires additional information at the end of the mol file in order to properly create a polymer params file, more than just the "M ROOT X" when you were running the non-polymer molfile_to_params.py and making a ligand params file. Take a look at Step 4 in the README file for more info. The full path is below.

rosetta-3.3/protocol_capture/using_ncaas_protein_peptide_interface_design/HowToMakeResidueTypeParamFiles/README

I looked at the params file in your last post. It has some backbone dihedrals being listed as chi angles (CHI 1 N CA C O ) which suggests you have not properly set the backbone atoms in the molfile that gets passed to the molfile_to_params_polymer.py script (most likely the backbone carbonyl oxygen but I would double check all of them).
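The symptom described above, a backbone dihedral listed as a chi angle, can be flagged with a small check. This is a minimal sketch under the assumption that CHI lines have the form "CHI <index> <atom1> <atom2> <atom3> <atom4>" and that the backbone atoms are named N, CA, C and O; the function name is hypothetical.

```python
BACKBONE = {"N", "CA", "C", "O"}


def backbone_only_chis(params_text):
    """Return (chi index, atoms) for CHI lines built only from backbone atoms."""
    flagged = []
    for line in params_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "CHI":
            atoms = fields[2:6]
            if all(atom in BACKBONE for atom in atoms):
                flagged.append((int(fields[1]), atoms))
    return flagged


if __name__ == "__main__":
    # First line is the suspicious chi quoted above; second is a normal one.
    params = "CHI 1  N    CA   C    O\nCHI 2  N    CA   CB   CG\n"
    for index, atoms in backbone_only_chis(params):
        print("CHI", index, "uses only backbone atoms:", " ".join(atoms))
```

A normal side-chain chi still starts with N and CA, so only lines where all four atoms are backbone atoms are reported.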

On the molfile issues above: the molfile_to_params scripts need a molfile in version 2000 format, but with the molfile version 3000 bond types (this was not my doing).

Tue, 2011-11-15 12:27
renfrew

Hi Doug, thanks for looking at this thread. I had not realized that I needed to use a special version of babel to make this hybrid molfile. I was using an older version of Rosetta (one that was already on our lab computers), and it didn't have the more detailed readme for the NCAA stuff (I was going off of some older posts in RosettaCommons).

I have a question about this special babel: does it only generate the hybrid molfile from a gaussian input, or will it make the hybrid .mol from any input? To make my NCAA, I basically copied it out of the .pdb (it is a lysine covalently bound to a cofactor), so I don't think I need to run any optimization on it. Therefore I will be trying to make the hybrid .mol from a .pdb file. If so, I'll give that a go and keep my fingers crossed!

I am pretty sure that I set the backbone atoms in the initial molfile correctly, so perhaps the incompatible file type is causing this issue?

Regarding Steven's reply:
I played around with the spacing (the lines with UPPER and LOWER in them had some shifting) to make it look identical to the .params file of lysine (since the NCAA is a lysine plus cofactor), and played around with the order of ICOOR lines, which did not help anything. The order of the atoms in the right three columns was different in my .params than that of lysine, so I changed that around too, which worked fine for the one with UPPER, but when I did this for LOWER (moving it from the lefthand column to the righthand column) I got a segmentation fault. Also, playing with the CHI lines did not help either. Thanks for your suggestions though, let's see what happens when I use the proper molfile format.

Regarding Rocco's response:
I did notice those things, and I don't really understand what's going on with the H's, but I did change that one ICOOR line to actually make sense, but that didn't help my current problem. I'll keep that in mind though, should my new params file have something funky going on like that.

Tue, 2011-11-15 23:14
oerbilgin

Some stuff between Doug and the user off-forum; read emails in reverse order

(second email)
Hi Onur,

Sorry it took me a while to get back to you. I have been busy before the holiday. I found your bug. The KRTmol3.mol is fine except you had the ROOT atom listed in the POLY_IGNORE.

Yours...
M ROOT 34
M POLY_N_BB 34
M POLY_CA_BB 35
M POLY_C_BB 36
M POLY_O_BB 37
M POLY_IGNORE 1 2 4 5 6 7 8 9 10 11 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
M POLY_UPPER 12
M POLY_LOWER 3
M POLY_CHG 0
M POLY_PROPERTIES PROTEIN
M END

what it should be...
M ROOT 34
M POLY_N_BB 34
M POLY_CA_BB 35
M POLY_C_BB 36
M POLY_O_BB 37
M POLY_IGNORE 1 2 4 5 6 7 8 9 10 11 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
M POLY_UPPER 12
M POLY_LOWER 3
M POLY_CHG 0
M POLY_PROPERTIES PROTEIN
M END

Making that change and running the command below should work without error.
python using_ncaas_protein_peptide_interface_design/HowToMakeResidueTypeParamFiles/scripts/molfile_to_params_polymer.py --clobber --polymer --no-pdb --name KRT -k KRT.kin KRTmol3.mol

I have attached the modified input and output from the script. I took a quick look at the params and it looks okay but I did not get a chance to test it. The next step for you would be to use the PDB rotamers as Steven described in some of the previous comments in the post.

It would be great if you could add something to the forums about how it got resolved so that others can follow it.

Thanks,
Doug
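The bug found in the email above (the ROOT atom also appearing in POLY_IGNORE) can be caught before running the script. The following is a minimal sketch, not part of molfile_to_params_polymer.py. It assumes the control lines look like the "M KEY values" lines quoted above. Checking the four backbone keys in addition to ROOT is an assumption on my part; the email only identifies ROOT as the problem.

```python
def parse_m_lines(mol_text):
    """Collect 'M ROOT ...' and 'M POLY_* ...' lines as key -> list of ints."""
    records = {}
    for line in mol_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "M":
            if fields[1] == "ROOT" or fields[1].startswith("POLY_"):
                # Non-numeric values (e.g. PROTEIN) are skipped here.
                records[fields[1]] = [int(v) for v in fields[2:] if v.isdigit()]
    return records


def ignored_but_required(mol_text):
    """Map atom index -> keys that name it, for atoms also in POLY_IGNORE."""
    records = parse_m_lines(mol_text)
    ignored = set(records.get("POLY_IGNORE", []))
    conflicts = {}
    for key in ("ROOT", "POLY_N_BB", "POLY_CA_BB", "POLY_C_BB", "POLY_O_BB"):
        for index in records.get(key, []):
            if index in ignored:
                conflicts.setdefault(index, []).append(key)
    return conflicts


if __name__ == "__main__":
    # Shortened version of the original control block: 34 is both ROOT
    # and listed in POLY_IGNORE.
    block = (
        "M ROOT 34\nM POLY_N_BB 34\nM POLY_CA_BB 35\n"
        "M POLY_IGNORE 1 2 4 33 34\nM POLY_PROPERTIES PROTEIN\nM END\n"
    )
    print(ignored_but_required(block))
```

An empty result means none of the checked atoms is also being ignored.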

(first email)
Hi Doug,

I used the babel that generates the hybrid molfile, but when I try to run the molfile_to_params_polymer.py script, I get an assertion error. When I run it with assertions off ($pymol -O /path/molfile_to_params_polymer.py -blahblahblah), I get an IndexError saying the list index is out of range. What am I doing wrong? I have two versions of this molfile: one with the hydrogens associated with each residue clustered with that residue (as you have in your demo), and one with all of the hydrogens put at the end. This second molfile works fine with the script, but gives me the same set_atom_base: atoms dont exist! error in rosetta.

I made the starting .pdb file by copying out the section of residues (Ala Lys Tyr), and the covalent ligand, and making the covalent ligand part of the Lys residue, renaming it to KRT. To add the hydrogens, I use babel with the -h flag to make another pdb file. Babel puts the hydrogens at the end, which I then copy and paste with their respective residues or leave at the end. Then I use the special babel to make the hybrid molfile, add the M lines, and run the polymer script.

I attached all the relevant files.
KRTupdownhydrocentered.pdb is my starting pdb that looks exactly like your ornithine example. capping groups with their hydrogens at the beginning, my NCAA at the end. KRTmol3 is the molfile for this pdb.
KRTupdownhydroc.pdb is the other pdb that I don't modify at all. That is, N capping group, NCAA, C capping group, all hydrogens at the end. KRTmol4 is the molfile for that pdb. This latter combination runs in the molfile polymer script, but gives that error when I try to run it in Rosetta.

Thanks for your help,

Onur Erbilgin

Wed, 2011-11-23 11:23
smlewis

Thanks Steven- I was going to post that up when I finally got this thing to run. But, of course, I ran into another problem which maybe someone can help me with- After successfully making the params file, I moved it into the database/chemical/residue_type_sets/fa_standard/residue_types/l-ncaa folder, and called it in the residue_types.txt file (it is not commented out like the other ncaas). When I try to run enzdes, I get this:
ERROR: ERROR: Failed to find amino acid UNK in EnvSmooth::representative_atom_name
ERROR:: Exit from: src/core/scoring/methods/EnvSmoothEnergy.cc line: 423

And in EnvSmoothEnergy.cc, the error is coming from a function which is not supposed to be called for ncaa's. Why is it being called on my ncaa?
Here's the function:

/// @details returns const & to static data members to avoid expense
/// of string allocation and destruction. Do not call this function
/// on non-canonical aas
std::string const &
EnvSmoothEnergy::representative_atom_name( chemical::AA const aa ) const
{
    assert( aa >= 1 && aa <= chemical::num_canonical_aas );

    static std::string const cbeta_string( "CB" );
    static std::string const sgamma_string( "SG" );
    static std::string const cgamma_string( "CG" );
    static std::string const cdelta_string( "CD" );
    static std::string const czeta_string( "CZ" );
    static std::string const calpha_string( "CA" );
    static std::string const ceps_1_string( "CE1" );
    static std::string const cdel_1_string( "CD1" );
    static std::string const ceps_2_string( "CE2" );
    static std::string const sdelta_string( "SD" );

    switch ( aa ) {
        case ( chemical::aa_ala ) : return cbeta_string; break;
        case ( chemical::aa_cys ) : return sgamma_string; break;
        case ( chemical::aa_asp ) : return cgamma_string; break;
        case ( chemical::aa_glu ) : return cdelta_string; break;
        case ( chemical::aa_phe ) : return czeta_string; break;
        case ( chemical::aa_gly ) : return calpha_string; break;
        case ( chemical::aa_his ) : return ceps_1_string; break;
        case ( chemical::aa_ile ) : return cdel_1_string; break;
        case ( chemical::aa_lys ) : return cdelta_string; break;
        case ( chemical::aa_leu ) : return cgamma_string; break;
        case ( chemical::aa_met ) : return sdelta_string; break;
        case ( chemical::aa_asn ) : return cgamma_string; break;
        case ( chemical::aa_pro ) : return cgamma_string; break;
        case ( chemical::aa_gln ) : return cdelta_string; break;
        case ( chemical::aa_arg ) : return czeta_string; break;
        case ( chemical::aa_ser ) : return cbeta_string; break;
        case ( chemical::aa_thr ) : return cbeta_string; break;
        case ( chemical::aa_val ) : return cbeta_string; break;
        case ( chemical::aa_trp ) : return ceps_2_string; break;
        case ( chemical::aa_tyr ) : return czeta_string; break;
        default :
            utility_exit_with_message( "ERROR: Failed to find amino acid " + chemical::name_from_aa( aa ) + " in EnvSmooth::representative_atom_name" );
            break;
    }

    // unreachable
    return calpha_string;
}

P.S. I'm still using Rosetta 3.2

Thanks

Mon, 2011-11-28 13:37
oerbilgin

It looks like EnvSmoothEnergy::residue_energy() has a guard for rsd.is_protein(), so if your residue isn't a canonical amino acid but is described as PROTEIN in your params file, it passes that guard and you'll run into the error you saw.

Probably the best solution is to turn off the env_smooth term by removing its line from the rosetta_database/scoring/weights/enzdes.wts file. (Note that env_smooth isn't in Rosetta's standard score12 weights, although there are some indications that including it gives better results, which is why it's in the enzdes weights.)

Another option is to remove the "PROTEIN" designation from your NCAA's params file, and replace it with the "POLYMER" designation (if you don't already have both). I don't know if that will lead to other issues elsewhere, though.
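Removing the env_smooth line, the first option above, can be done on a copy of the weights file so the database original stays untouched. A minimal sketch, assuming each weights line starts with the term name followed by its weight; the function name and the numeric weights in the demo are made up.

```python
def drop_score_term(weights_text, term="env_smooth"):
    """Return the weights text without the line whose first field is `term`."""
    kept = []
    for line in weights_text.splitlines():
        fields = line.split()
        if fields and fields[0] == term:
            continue  # drop only the requested term
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical weights excerpt; the values are placeholders.
    original = "fa_atr 0.8\nenv_smooth 1.0\nfa_rep 0.44\n"
    print(drop_score_term(original), end="")
```

Comment lines and blank lines pass through unchanged, since only a line whose first field equals the term name is dropped.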

Tue, 2011-11-29 08:57
rmoretti

Commenting out the env_smooth line in the enzdes.wts file did the trick! Thank you all so much for all of your help. Now I will generate the rotamer library as described in Doug's protocol capture, and hopefully that will work too!

Tue, 2011-11-29 11:05
oerbilgin