Do we need to relax the structure before doing the fixbb design? Thank you!
I would. The issue is that the energy functions that crystallographers use to build models are slightly different from the one Rosetta uses. Because the Rosetta energy function can be sensitive to slight changes in structure (for atom overlaps and the like), this can mean that a "good enough" structure for the crystallographer can have what looks like terrible clashes to Rosetta. If you use a structure straight from the PDB, these small variations can cause Rosetta to design away from the native sequence, thinking the native amino acid won't fit.
The currently recommended protocol is to use an all-heavy-atom, always constrained relax protocol. See
Nivon et al. "A Pareto-Optimal Refinement Method for Protein Design Scaffolds" http://dx.plos.org/10.1371/journal.pone.0059004
I'm a little biased in this assessment, though. Many design projects (most of the published ones, in fact) have been done with little or ad-hoc structure preparation. Rosetta fixbb should design even without doing a pre-relax, although you'll get more mutations (and the risk that comes along with them) than if you pre-relax.
Does relaxation rely on the residue type?
I think relaxing the structure before design will bias Rosetta toward the native sequence.
For example, a protein wild-type sequence is
Suppose that, when relaxing the structure, the program finds there is not enough space for the W and enlarges the space at this position. After relaxing, we get a relaxed structure and feed it into the fixbb program. For the position I mentioned, fixbb will find that the space there is very large, so it needs to put a large residue there. But what happens if we don't know the wild-type sequence and only have the backbone structure? Or even a new fold without any wild-type sequence? Let's say we remove all the side-chain information and rename the residues as ALA or SER, then feed that structure to the relax program. I believe the result will be much different.
I am not sure whether my understanding is correct. Please point out any mistakes. Thank you.
Right - relax will do the relaxation in the current context. This will impose some bias regarding the current residue types. Putting on all-atom constraints will minimize this, however. With the all-atom constraint protocol I mentioned, the atom movement is very small - less than 0.1 Å on average. If you pull up the before and after structures they look hardly any different.
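If you want to check the size of the movement on your own structure, here is a minimal Python sketch that averages the heavy-atom displacement between the input and the relaxed PDB. The function names are my own (not part of Rosetta), and it assumes both files keep the same chain IDs, residue numbering and atom names, and that no superposition is needed because the coordinate constraints hold the structure in place.

```python
import math


def heavy_atom_coords(pdb_text):
    """Map (chain, residue number + insertion code, atom name) -> (x, y, z)
    for the heavy atoms found in ATOM records of a PDB-format string."""
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        element = line[76:78].strip() if len(line) >= 78 else ""
        # Skip hydrogens: use the element column when present, otherwise
        # fall back to the atom name (e.g. "HA" or "1HB").
        if element == "H" or (not element and name.lstrip("0123456789").startswith("H")):
            continue
        key = (line[21], line[22:27], name)
        coords[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def mean_displacement(before_text, after_text):
    """Average distance (in Angstrom) moved by heavy atoms present in both structures."""
    before = heavy_atom_coords(before_text)
    after = heavy_atom_coords(after_text)
    shared = before.keys() & after.keys()
    if not shared:
        raise ValueError("no shared heavy atoms between the two structures")
    return sum(math.dist(before[k], after[k]) for k in shared) / len(shared)
```

To use it, read the two files with open(...).read() and pass the strings to mean_displacement. Atoms that exist in only one of the files are ignored.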
Often this slight bias toward the starting sequence is desirable - you know the wildtype sequence is stably folded, whereas each mutation you introduce has a small but not insignificant chance of accidentally destabilizing the protein. Minimizing the mutations to only those that are absolutely necessary tends to increase the solubility of the designs when experimentally tested.
If you don't have a native sequence, though, and are stuck with an all-ALA backbone or the like, doing the relax protocol isn't likely to help much. But chances are that with an all-ALA structure you're not going to have the bad VDW contacts and slightly-off-rotamer sidechains that are typically what cause high scores and issues with redesigns. In that case I'd agree it's pointless to run the constrained relax protocol.
The relax is performed by feeding a clean PDB to the program:
relax.linuxgccrelease -database rosetta/rosetta_database [-extra_res_fa your_ligand.params] -relax:constrain_relax_to_start_coords -relax:coord_constrain_sidechains -relax:ramp_constraints false -s your_structure.pdb
"By default relax uses a harmonic constraint with the strength adjusted by coord_cst_stdev". What are the suggested width and stdev?
There is also a script, sidechain_cst_3.py, to produce sidechain constraints. What's the difference between relax.linuxgccrelease and this script? It seems that sidechain_cst_3.py is shown in the longer protocol and is not recommended. Thank you!
The sidechain_cst_3.py script was written when the relax application couldn't yet do the sidechain constraints. That logic has since been moved into the relax application, and it can do the constraint generation automatically. There are some slight variations between how the two schemes work, but nothing that majorly affects typical usage.
As for the coord_cst_stdev, the defaults are the recommended values - don't bother setting it unless you want to play around with things. (Figure 1 of the Nivon et al. paper should show how changing those parameters affects things.) The defaults for the relax application are no cst width (so using harmonic instead of the flat-bottomed bounded constraints) and a stdev of 0.5.
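To make the difference between the two constraint shapes concrete, here is a rough Python sketch of the per-atom penalty as a function of how far an atom has moved from its starting position. This is a simplified illustration, not Rosetta's actual code: the function names are mine, the flat-bottomed version is kept quadratic everywhere outside the width (as far as I know the real bounded function switches to a linear tail further out), and the final score contribution also depends on the weight of the constraint score term.

```python
def harmonic_penalty(displacement, stdev=0.5):
    """Harmonic penalty: grows as (d / stdev)^2 from zero displacement."""
    return (displacement / stdev) ** 2


def flat_bottom_penalty(displacement, width, stdev=0.5):
    """Flat-bottomed penalty (simplified): zero while the atom stays within
    `width` of its start, then quadratic in the excess distance."""
    excess = max(0.0, abs(displacement) - width)
    return (excess / stdev) ** 2


if __name__ == "__main__":
    # Compare the two shapes for a few displacements (in Angstrom).
    # The width of 0.3 is an arbitrary illustration value, not a recommendation.
    for d in (0.0, 0.1, 0.5, 1.0):
        print(d, harmonic_penalty(d), flat_bottom_penalty(d, width=0.3))
```

With the harmonic form every movement costs something, which is why the atoms barely move; with a flat bottom, atoms can shift freely inside the width before any penalty applies.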
got it. Thank you. I will choose the short protocol.
One more question:
If I have a dimer and I want to design chain A in the dimer,
should I relax each chain individually or relax all the chains in one PDB? Thank you!
I'd relax everything as a complex. Technically, if you're not changing chain B, you're probably not going to run into those issues where small clashes in distal parts of B will affect the output. However, you may see effects in the interface, where moving B slightly will better fix (i.e. with smaller deviation) a small clash than would just moving A. Also, relaxing on the apo-A structure will change the environment of the interface residues, which may result in them moving to a position where you've now introduced a slight clash with B when you re-generate the holo structure.
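Once the complex is relaxed, one common way to restrict fixbb to chain A is a resfile. Below is a minimal Python sketch that writes one from the relaxed PDB. The helper names are hypothetical, and the choice of NATAA as the default (other residues may repack but keep their amino acid identity) is my assumption about what you want for chain B; NATRO would instead freeze those side chains entirely.

```python
def chain_residues(pdb_text, chain_id):
    """Residue identifiers (number plus insertion code, e.g. '40' or '40A')
    for one chain, in the order they appear in the ATOM records."""
    ordered, seen = [], set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 26 and line[21] == chain_id:
            resid = line[22:27].strip()
            if resid not in seen:
                seen.add(resid)
                ordered.append(resid)
    return ordered


def make_resfile(pdb_text, design_chain):
    """Build resfile text: every residue of `design_chain` may become any
    amino acid (ALLAA); everything else falls under the NATAA default."""
    lines = ["NATAA", "start"]
    lines += [f"{resid} {design_chain} ALLAA"
              for resid in chain_residues(pdb_text, design_chain)]
    return "\n".join(lines) + "\n"
```

Write the returned text to a file and pass it to fixbb with the -resfile option, using the relaxed complex as the input structure.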