Strong bias in sampling observed in RosettaDock

Dear Rosetta Community,

I am interested in flexible blind (global) docking of a monomer to generate homodimers. For this I used the RosettaDock application with the -ensemble1 and -ensemble2 flags, passing the prepacked structure to -in:file:s. I observe a strong bias in the output population of docked dimers that depends on which structure I pass to -in:file:s. The docking command is as follows:

mpirun -np $PBS_NTASKS $ROSETTA_BIN/docking_protocol.mpi.linuxiccrelease -database $ROSETTA_DB -in:file:s $infile \
-randomize1 -randomize2 -spin \
-nstruct $nstruct \
-ensemble1 list_en -ensemble2 list_en \
-dock_pert 3 8 -dock_mcm_trans_magnitude 0.1 -dock_mcm_rot_magnitude 5.0 \
-docking_low_res_score motif_dock_score -mh:path:scores_BB_BB $ROSETTA_DB/additional_protocol_data/motif_dock/xh_16_ -mh:score:use_ss1 false -mh:score:use_ss2 false -mh:score:use_aa1 true -mh:score:use_aa2 true \
-use_input_sc \
-low_res_protocol_only \
-score:docking_interface_score 1 \
-out:file:scorefile $name'.sc'  \
-out:path:pdb ./$name'pdbs'/. \
-ignore_zero_occupancy false \
-show_accessed_options \
-protein_dielectric 8 -water_dielectric 80 \
-out:level 300 -mute protocols.docking -mpi_tracer_to_file $name'.log' >> $name'.options'


I have 5 monomer conformations of my protein for global flexible docking, named 1.pdb.ppk, 2.pdb.ppk, 3.pdb.ppk, 4.pdb.ppk, and 5.pdb.ppk.

In the first case, I passed a dimer built from 1.pdb.ppk--1.pdb.ppk in a random docked pose to -in:file:s. In the resulting ensemble, the majority of docked poses contain 1.pdb.ppk as one of the docking partners. The results are as follows:

Conformations in the dimer    Population
1-1                           33351
1-2                           12749
1-3                           12187
1-4                           13119
1-5                           8304
2-2                           1558
2-3                           2816
2-4                           3303
2-5                           1918
3-3                           1339
3-4                           3051
3-5                           1809
4-4                           1823
4-5                           2103
5-5                           597
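
To put a number on the bias, the populations above can be compared with a simple baseline. The sketch below is only an illustration: it assumes that, without bias, each partner would be drawn independently and uniformly from the 5 conformers, which is an assumption of this example and not a statement about how RosettaDock actually selects conformers. The function names are hypothetical.

```python
# Populations copied from the table above (input pose: 1.pdb.ppk--1.pdb.ppk).
counts = {
    (1, 1): 33351, (1, 2): 12749, (1, 3): 12187, (1, 4): 13119, (1, 5): 8304,
    (2, 2): 1558, (2, 3): 2816, (2, 4): 3303, (2, 5): 1918,
    (3, 3): 1339, (3, 4): 3051, (3, 5): 1809,
    (4, 4): 1823, (4, 5): 2103,
    (5, 5): 597,
}


def observed_fraction(pair_counts, conformer):
    """Fraction of all docked poses that contain the given conformer."""
    total = sum(pair_counts.values())
    hits = sum(n for pair, n in pair_counts.items() if conformer in pair)
    return hits / total


def uniform_fraction(n_conformers):
    """Baseline fraction under the ASSUMED unbiased model: each partner is
    picked independently and uniformly, so a given conformer is absent from
    both partners with probability ((n - 1) / n) ** 2."""
    return 1.0 - ((n_conformers - 1) / n_conformers) ** 2


if __name__ == "__main__":
    baseline = uniform_fraction(5)
    for conformer in range(1, 6):
        frac = observed_fraction(counts, conformer)
        print(f"conformer {conformer}: observed {frac:.3f}, uniform baseline {baseline:.3f}")
```

Under this baseline, a conformer whose observed fraction is far above the uniform value is over-represented, which is the pattern described for the conformer used in the input pose.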


In another case, I observe very similar results when I do the same with 4.pdb.ppk--4.pdb.ppk as the input docked pose passed to -in:file:s. There is a strong bias towards docked poses containing 4.pdb.ppk as one of the binding partners:

Conformations in the dimer    Population
1-1                           3336
1-2                           4906
1-3                           5856
1-4                           33128
1-5                           6040
1-6                           4871
2-2                           1876
2-3                           4635
2-4                           24385
2-5                           4518
2-6                           3441
3-3                           2916
3-4                           30597
3-5                           5881
3-6                           4327
4-4                           96004
4-5                           30952
4-6                           22960
5-5                           3149
5-6                           4507
6-6                           1728

I have observed this in other similar test cases too. Is it due to a change in the fold tree depending on what I pass to -in:file:s? Or is there a mistake somewhere? I would like to know what I should do to avoid any sampling issues.


Thu, 2020-08-27 22:23


That's a great observation! It makes sense if the conformations differ by relatively large conformational changes, because conformer swapping happens by aligning the interface. If your conformations are quite different from one another, the newly swapped-in conformations are likely to have clashes, and hence the Monte Carlo acceptance rates for the swaps are close to zero. In general, we do not recommend global sampling with conformational switching: the sample space is far too large, and even 300K decoys might not be enough. We never tested it, so we don't know what the correct values would be. A better approach would be to do the global docking with a single backbone using an FFT-based method (say, ClusPro). Then, on the top 10 or so states, use local docking with conformational ensembles.
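
To illustrate why clashes drive the acceptance rate towards zero, here is a minimal sketch of the standard Metropolis criterion. It is not Rosetta's code: the function name, the kT value, and the example score changes are all hypothetical and chosen only to show the trend.

```python
import math


def metropolis_acceptance(delta_score, kT=0.8):
    """Standard Metropolis acceptance probability for a proposed move.

    delta_score: score(new) - score(old); positive means the move is worse.
    kT: temperature factor (0.8 is an arbitrary illustrative value).
    """
    if delta_score <= 0:
        return 1.0  # moves that do not worsen the score are always accepted
    return math.exp(-delta_score / kT)


if __name__ == "__main__":
    # Hypothetical score changes: a mild worsening versus a severe clash.
    for delta in (-1.0, 0.5, 5.0, 100.0):
        print(f"delta = {delta:6.1f} -> acceptance = {metropolis_acceptance(delta):.3e}")
```

A swapped-in conformer that clashes produces a large positive score change, so the swap is almost never accepted and sampling stays with the conformer that was already in place, consistent with the bias towards the input conformation.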



Sun, 2020-09-20 15:50