You are here

Docking an iron atom to a peptide

13 posts / 0 new
Last post
Docking an iron atom to a peptide
#1

Hi All,

This is my first post to the form. I'm a new user to Rosetta.

I am using the ligand_docking application to dock an Fe(III) atom to a protein structure, and am running into difficulty. A few questions:

1. Is ligand_docking appropriate for docking a single atom to a protein molecule?

2. If ligand_docking is appropriate, what is the best way to do it? When I try to use the 1_setup.sh script produced by arls.py to set up a ligand docking run, it complains that I need to have at least 3 atoms in the Mol2 file used to define the ligand (the Fe atom).

Thanks! and all the best,
--Buz

Post Situation: 
Wed, 2011-01-26 09:49
buzb

A) probably not. Metal atoms dock to proteins based on sensitive electronic structure (in the metal atom); Rosetta does not represent those details. Biomolecules and small organics dock based on van der Waals and hydrogen bonding, which we can generalize and do calculate. I know the default force field can't do this; you may be able to poke around the available scorefunction terms and find one that can, but I don't think we have any designed for the purpose. Maybe look into the metalloprotein_abrelax code (there should be a demo in test/integration/tests/metalloprotein_abrelax)?

B) You need virtual atoms to get monatomic metal ligands to work. It's hard rule that residues need 3 atoms because Rosetta functions on internal coordinates, and you can't define the coordinate frames without enough atoms. Iron is probably already in the database. Check out rosetta_database/chemical/residue_type_sets/fa_standard/residue_types/FE.params. I typed that from memory so you may need to poke around a little to find it.

C) Do you know where the ion goes? I would place it there with constraints. That's basically what I've done in the past for a project involving metal binding sites.

Wed, 2011-01-26 10:33
smlewis

Hi smlewis,

Thanks for your speedy response!

I was just reading Chu Wang's 2010 paper from Protein Science, and this looked like a good starting point for designing metal binding proteins. I have a template sequence that I know binds iron, that I want to use as a design template. I have a good idea of where in this starting sequence the Fe atom should bind (4 cysteine residues).

I was just looking through the metalloprotein_abrelax demo, and this looks almost perfect. How can I substitute the Zn atom in the demo for an Fe(III) atom?

Thanks! and all the best,

--Buz

Wed, 2011-01-26 11:33
buzb

I would guess you can try just replacing the zinc atom in the pdb with an iron atom. Be very careful with the spacing, atom name " FE " is not the same as "FE " or " FE". The params file for the iron atom will tell you what the spacing ought to look like.

I forwarded your other question along to someone who knows something about fragments generation, (because I sure don't.)

Wed, 2011-01-26 12:53
smlewis

Hi smlewis,

Thanks for getting back to me! My response is broken up into two pieces, which may seem disjointed. But here goes anyway!

Firstly, I am trying to modify the metalloprotein_abrelax demo to use my peptide sequence, rather than the protein sequence included in the original demo. Thus far, I'm not too worried if the predicted structure is bound to Zn or Fe (I'll worry about this later!). Here is my sequence:

DTHFPICIFC[CYZ]C[CYZ]GCCHRSKC[CYZ]GMC[CYZ]CKTZ[ZN]

I replaced the fragment files included with the demo with fragment files that I generated using the Robetta fragment server. These fragment files were generated using the non-metal bound primary sequence:

DTHFPICIFCCGCCHRSKCGMCCKTZ

I also replaced the .psipred_ss2 file in the demo with a .psipred_ss2 file produced by the Robetta fragment server. I also modified the flags file to point to these files.

I ran the AbinitioRelax application using this flag file. However, the code crashed with the warning and error:

core.pack.task: Packer task: initialize from command line()
Warning: Unable to locate database file Dunbrack02.lib.bin
ERROR: expected to read 18 libraries from Dun02, but read 10
ERROR:: Exit from: src/core/scoring/dunbrack/RotamerLibrary.cc line: 847

Could you tell me what I'm doing wrong? Why does the code expect to read 18 libraries rather than 10? Is this because of my choice of fragment libraries?

Thanks! and all the best,

--Buz

Wed, 2011-01-26 17:07
buzb

You should be getting 18 libraries out of bbdep02.May.sortlib. There are 18 canonical residue types (CDEFHIKLMNPQRSTVWY) that have Dunbrack libraries. Alanine and cysteine don't have rotamers, thus, no library. Something is horribly wrong if you're getting 10.

I've never seen the error before but I suspect the file is corrupted. Look at bbdep02.May.sortlib, it should have this md5 and size in bytes:
> md5sum bbdep02.May.sortlib
ff356363d20f99fcbcb98b173fb316f8 bbdep02.May.sortlib

>ls -l bbdep02.May.sortlib
-rw-r--r-- 1 smlewis smlewis 50417532 2008-05-02 10:31 bbdep02.May.sortlib

Also try deleting the file Dunbrack02.lib.bin if it is present. (It is a locally-generated binary version of the text version (the bbdep02 file) which loads in fractional seconds instead of many seconds)

Wed, 2011-01-26 18:09
smlewis

Hi smlewis,

Thanks! This worked perfectly! The rosetta database that I was using didn't pass the checksum, so I switched to one that did have a correct checksum, and this problem cleared itself up.

Next problem! I ran the AbinitoRelax code again, using modified a FASTA file and the code exited with this error:

ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 26 fold_tree nres: 32

I have modified the .tetraL.fa_cst, .cen_cst files, but don't know what to do with the .residue_pair_jump_cst file, as I'm not too sure what the entries mean.

For your information, the metal binding moiety that I want to use is tetrahedral binding around the metal, with 4 cysteines coordinating the metal atom. These cysteines are at positions 10,11, 19 and 22 in from a primary sequence of 25 aa plus Fe. (26 total residues).

Here is the .residue_pair_jump_cst file as it stands now (unchanged from the original in the metalloprotein_abrelax demo):

BEGIN
jump_def: 5 32 31 31
aa: CYS ZN
cst_atoms: SG CB CA ZN V1 V2
jump_atoms: N CA C ZN V1 V2
disAB: 2.20
angleA: 68.0
angleB: 70.5
dihedralA: -150.0 -120.0 -90.0 -60.0 -30.0 0.0 30.0 60.0 90.0 120.0 150.0 180.0
dihedralAB: -150.0 -120.0 -90.0 -60.0 -30.0 0.0 30.0 60.0 90.0 120.0 150.0 180.0
dihedralB: 120.0
END

Here is the tetraL.fa_cst file:

AtomPair SG 10 V1 26 HARMONIC 0.0 0.2
Angle CB 10 SG 10 ZN 26 HARMONIC 1.95 0.35
AtomPair SG 11 V2 26 HARMONIC 0.0 0.2
Angle CB 11 SG 11 ZN 26 HARMONIC 1.95 0.35
AtomPair SG 19 V1 26 HARMONIC 0.0 0.2
Angle CB 19 SG 19 ZN 26 HARMONIC 1.95 0.35
AtomPair SG 22 V2 26 HARMONIC 0.0 0.2
Angle CB 22 SG 22 ZN 26 HARMONIC 1.95 0.35

And the cen_cst file:

AtomPair CB 10 ZN 26 HARMONIC 3.2 0.2
AtomPair CB 11 ZN 26 HARMONIC 3.2 0.2
AtomPair CB 19 ZN 26 HARMONIC 3.5 0.2
AtomPair CB 22 ZN 26 HARMONIC 3.2 0.2

Thanks! and all the best,

--Buz

Wed, 2011-01-26 19:41
buzb

Hi again,

Here's the next part of my response. I am still confused as to how I substitute Zn for Fe in the input metalloprotein demo. As I understand the metalloprotein_abrelax demo, the only structural input that is needed for the program is the primary sequence of the protein whose structure is to be predicted and the location of the metal binding sites in this demo.

Could I modify the fasta file (this is based upon the fasta file included in the metalloprotein_abrelax demo) as follows:

DTHFPICIFC[CYZ]C[CYZ]GCCHRSKC[CYZ]GMC[CYZ]CKTZ[FE]

In addition, what changes need to be made to the tetraL.fa_cst, tetraR.fa_cst and residue_pair_jump_cst files?

Thanks! and all the best,

--Buz

Wed, 2011-01-26 17:19
buzb

We are wayyyy out into "wild guess" territory here.

The modification to the fasta will probably work if FE is the full residue type name for the iron atom. That name will appear near the top of the params file (you did find that, right? It looks like FE to me in my copy of the database).

For the constraint files, you'll want to replace the ZN atom names with FE atom names. The cysteine SG's will have the same atom names. You will need to replace all the residue numbers as well if your protein has a different number of residues. The constraint file documentation (http://www.rosettacommons.org/manuals/archive/rosetta3.2_user_guide/cons...) is frankly some of our best; if you can't figure it out from this let me know. You will also need to fix the geometry of the constraints, if a tetra-cysteine iron does not have the same geometry as a zinc. It may be easiest to draw a picture of what the current constraints say then figure out what to tweak.

Wed, 2011-01-26 18:15
smlewis

Hi smlewis,

Sorry, I am being a total newb. Where do I find the .params file? Is it in the rosetta_database directory?

BTW, that documentation looks great!

Thanks! and all the best,

--Buz

Wed, 2011-01-26 19:44
buzb

rosetta_database/chemical/residue_type_sets/fa_standard/residue_types/metal_ions/FE.params. You might not have the intervening metal_ions folder (but I think you will).

Thu, 2011-01-27 06:56
smlewis

Hi smlewis,

I think that I have figured out the changes to the tetraL.fa_cst, tetraR.fa_cs files, but I am not sure what to do with the residue_pair_jump_cst files. I can't find any documentation about it.

Thanks! and all the best,

--Buz

Thu, 2011-01-27 07:24
buzb

This is getting a little garbled, I seem to have lost track of one of your replies due to post nesting. We're beyond wild guess territory into "I'm making stuff up based on fragmentary understanding of glances at code I didn't write".

It sounds like you should have 26 residues, and your Conformation object has 26 residues, but your FoldTree has 32. The culprit is almost certainly the jump_def line at the top of the residue_pair_whatever file (notice the 32).

Poking around in the (3.1) code for the flag jumps::residue_pair_jump_file, I track the code to protocols/jumping/ResiduePairJumpSetup.cc, function read_file. It's not likely changed in 3.2.

It looks like it parses the jump_def line as jump start, jump end, cut start, cut end.

Jumps define nonbonded connections between atoms; here it is tethering the metal to the rest of the protein (Rosetta doesn't consider metal-ligand bonds.) So, your cut start/cut end will be 25.

Your jump end will be the metal (26).

Your jump start will be any of the cysteines; they appeared to use the first one in the sample.

You'll also need to modify the ZNs to FEs in that file; the V1/V2 virtual atoms are probably okay. You may need to modify the geometry.

Thu, 2011-01-27 10:58
smlewis