Atom-wise energy terms from nearest N-neighbors

4 posts / 0 new

Last post

Top

Hello PyRosetta users,

Can you please advise me on how I should go about calculating atom-wise energy terms?

Should I simply sum pairwise energies for the nearest N-neighbours or is there a better way of doing this?

Thanks!

Category:

Scoring

Post Situation:

Unsolved

Thu, 2021-03-25 14:49

jasiozaucha

Top

You are talking about the `pyrosetta.toolbox.atom_pair_energy.etable_atom_pair_energies` function? I gave that a go and was happy with the results (used relative to other atoms) but there's a big caveat that a few weights are needed as the values are out of wack. I think that is why the functions docstring/doxygen annotation uses the word "lj_atr" as opposed to "fa_atr", but "fa_elec" is the same word, but the values are not only scaled, but shifted! Note differential treatment of intra and inter makes no difference.

import pandas as pd

# scores per atom

def per_atom_scores_of_residue(pose:pyrosetta.Pose,
                               target_idx:int, 
                               scorefxn: pyrosetta.ScoreFunction)-> pd.DataFrame:
    """
    target_idx is residue pose index, not PDB residue index!
    documentation calls the score_types ['lj_atr', 'lj_rep', 'fa_solv', 'fa_elec']
    which are conceptually similar to ['fa_atr', 'fa_rep', 'fa_sol', 'fa_elec'] ?
    """
    score_types = ['lj_atr', 'lj_rep', 'fa_solv', 'fa_elec']
    stm = pyrosetta.rosetta.core.scoring.ScoreTypeManager()
    get_weights = lambda name: scorefxn.get_weight(stm.score_type_from_name(name))
    residue = pose.residue(target_idx)
    scores = {residue.atom_name(i): {st: 0 for st in score_types} for i in range(1, residue.natoms() + 1)}
    # Iterate per target residue's atom per all other residues' atoms
    for i in range(1, residue.natoms() + 1):
        iname = residue.atom_name(i)
        for r in range(1, pose.total_residue() + 1):
            other = pose.residue(r)
            if r != target_idx:
                weights = {name: get_weights(st_name) for name, st_name in zip(['lj_atr', 'lj_rep', 'fa_solv', 'fa_elec'], ['fa_atr', 'fa_rep', 'fa_sol', 'fa_elec'])}
            else:
                weights = {name: get_weights(st_name) for name, st_name in zip(['lj_atr', 'lj_rep', 'fa_solv', 'fa_elec'], ['fa_intra_atr', 'fa_intra_rep', 'fa_intra_sol', 'fa_intra_elec'])}
            for o in range(1, other.natoms() + 1):
                if r == target_idx and o == i:
                     continue # self to self should be zero...!
                score = pyrosetta.toolbox.atom_pair_energy.etable_atom_pair_energies(residue,
                                                                                     i,
                                                                                     other,
                                                                                     o,
                                                                                     scorefxn)
                for st, s in zip(score_types, score):
                    scores[iname][st] += s * weights[st]
    return pd.DataFrame(scores).transpose()


# regular per residue scores

def pose2pandas(pose: pyrosetta.Pose, scorefxn: pyrosetta.ScoreFunction) -> pd.DataFrame:
    """
    Return a pandas dataframe from the scores of the pose

    :param pose:
    :return:
    """
    pose.energies().clear_energies()
    scorefxn(pose)
    scores = pd.DataFrame(pose.energies().residue_total_energies_array())
    pi = pose.pdb_info()
    scores['residue'] = scores.index.to_series() \
        .apply(lambda r: pose.residue( r +1) \
               .name1() + pi.pose2pdb( r +1)
               )
    return scores

# compare the two
scorefxn = pyrosetta.get_fa_scorefxn()
pose.energies().clear_energies()
scorefxn(pose)
atomic = per_atom_scores_of_residue(pose, 1, scorefxn)
residual = pose2pandas(pose, scorefxn)
# these two differ...
atomic.sum()
residual.iloc[0]

I think the issue may lie with the fact that if you do

pyrosetta.toolbox.atom_pair_energy.etable_atom_pair_energies(pose.residue(1),
                                                             1,
                                                             pose.residue(1),
                                                             1,
                                                             scorefxn)

the values aren't zero, which is saying something. No idea what.

(-0.161725, 913.5442465225759, 1.4615969975432346, 0.9231945155105931)

Probably worth mentioning that 'fa_intra_rep' has a different weight than 'fa_rep', while the rest of the intra values are zero for ref2015

'fa_intra_rep': 0.005,
 'fa_rep': 0.55,

Mon, 2021-03-29 07:04

matteoferla

Top

Many thanks, this is anyway helpful!

Fri, 2021-04-09 05:00

jasiozaucha

Top

I had a play with this again as I needed it.

To keep it simple I made a pose with a Ca++ and Cl- 3 Å apart (viz. below) and run the above function... And I am not sure why I did not spot that simply the e-table scores per atom are 2x and that in my codeblock I had weighted the atomic scores, but not the residual ones. So the `lj_rep` is totally `fa_rep` or `fa_intra_rep`.

# okay Cl in database does not work.
# os.path.join(os.path.split(pyrosetta.__file__)[0], 'database', 'chemical', 'residue_type_sets', 'fa_standard', 'residue_types', 'anions', 'CL.params')
# ERROR: unrecognized atom_type_name 'Cl1p'
# so I made my own.
from rdkit_to_params import Params
p = Params.from_smiles('[Cl-]')
p.rename_atom('CL1', 'CL')
p.dump('CL.params')


pose = pyrosetta.Pose()
import os
params_filenames = pyrosetta.rosetta.utility.vector1_string(1)
params_filenames[1] = 'Cl.params'      
rst = pyrosetta.generate_nonstandard_residue_set(pose, params_filenames)
pose = pyrosetta.pose_from_sequence('X[CA]X[CL]', rst)

d = 3
xyz = pyrosetta.rosetta.numeric.xyzVector_double_t(d,0,0)
pose.residue(2).set_xyz("CL", xyz)
# reset scores the hard way
pose.set_new_energies_object(pyrosetta.rosetta.core.scoring.Energies())

scorefxn = pyrosetta.get_fa_scorefxn()
print(d, scorefxn(pose))


# unweighted after halving:
# lj_atr    -0.084853
# lj_rep     0.134981
# fa_solv    0.066238
# fa_elec   -5.469963
# weighted  after halving
# lj_atr    -0.084853
# lj_rep     0.074240
# fa_solv    0.066238
# fa_elec   -5.469963

The scores are the same for Ca and Cl residues, as there's nothing else and actually makes it obvious why I should have divided by 2 in the first place.

About the self with self score "mystery"... The wacky scores are just those one would get if there was an atom occupying the same space.

pyrosetta.toolbox.atom_pair_energy.etable_atom_pair_energies(pose.residue(1), 1,
                                                             pose.residue(1), 1,
                                                             scorefxn)
# (-0.12, 676.9213096377058, 0.0, 93.53140815508083)

pose = pyrosetta.pose_from_sequence('X[CA]X[CA]')

pyrosetta.toolbox.atom_pair_energy.etable_atom_pair_energies(pose.residue(1), 1,
                                                             pose.residue(2), 1,
                                                             scorefxn)
# (-0.12, 676.9213096377058, 0.0, 93.53140815508083)

Fri, 2021-05-28 02:41

matteoferla

Search form

You are here

Atom-wise energy terms from nearest N-neighbors