When I use the design protocols (mostly fixbb design), I always wonder how much of sequence space is effectively searched by the Monte Carlo simulated annealing method. For example, in a fixbb design at 13 positions, the output log reported a total of 15459 rotamers, and I generated 1000 decoys. So there should be 1000 sampling trajectories, each making random substitutions drawn from the 15459 rotamers, right? In this case, I noticed that the resulting sequences were largely converged. My questions are:
1) How many substitutions were tried? I found one paper (Kuhlman and Baker 2000) giving an estimate of about 1 million per Monte Carlo run, and another (Leaver-Fay et al 2008) mentioning 200*(number of residues)*(rotamers per residue). Is there a reason why "200" was chosen? Are other specific values used? If these numbers are set somewhere in the Rosetta code, where should I look?
2) Theoretically, the result should be the same no matter which starting sequence I choose. In other words, the starting sequence should not bias the sampling trajectory. Does this always hold in practice?
3) I guess it is still possible for trajectories starting from the same sequence to all become trapped in the same local minimum. In that case, the results would still look converged. How can I distinguish that from a true global minimum? And how can I avoid it? Should I randomize the starting sequence of each trajectory?
4) Are there any algorithms or parameters in the Rosetta design protocols that explicitly ensure that the Monte Carlo simulated annealing algorithm samples a sufficient portion of sequence space?
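
To make questions 1 and 3 concrete, here is a rough Python sketch of how I am reading the numbers. It is not Rosetta code: the function names are my own, and I am assuming that "rotamers per residue" in the Leaver-Fay et al formula means the average over the designed positions, so that (number of residues)*(rotamers per residue) equals the total rotamer count from the log. The second function is just one possible way to quantify "converged", using the per-position Shannon entropy of the decoy sequences.

```python
import math
from collections import Counter


def estimated_substitutions(total_rotamers, factor=200):
    """Substitution attempts per run under my reading of the formula.

    Assumes residues * (average rotamers per residue) == total_rotamers.
    The default factor of 200 is the value quoted from the paper.
    """
    return factor * total_rotamers


def per_position_entropy(sequences):
    """Shannon entropy (in bits) at each position across decoy sequences.

    0.0 means every decoy has the same residue at that position.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    n = len(sequences)
    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in sequences)
        entropies.append(sum(-(c / n) * math.log2(c / n) for c in counts.values()))
    return entropies


if __name__ == "__main__":
    # Numbers from my run: 15459 rotamers in total over 13 positions.
    print(estimated_substitutions(15459))  # 3091800
    # Hypothetical toy decoys, only to show the call; not real output.
    print(per_position_entropy(["AC", "AC", "AD", "AD"]))
```

If that reading is right, my run would attempt about 3 million substitutions per trajectory, which is the same order of magnitude as the roughly 1 million quoted from Kuhlman and Baker. Note that low entropy at every position only tells me the decoys agree with each other; it cannot tell a shared local minimum from the global one, which is exactly what question 3 is about.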