Rosetta 3.2.1 Release Manual
"linmin" is a **single step** steepest-descent algorithm with an exact line search. That is, the descent direction is the gradient of the energy function (vector of partial first derivatives), and the step is to the true (local) minimum of the function along that line. Repeated invocations of linmin would converge very slowly, which is why no one does it.
Conjugate gradient minimization accumulates a small amount of information from prior steps. It performs (somewhat) better than steepest descent. Rosetta++ included an implementation of FRPR (Fletcher-Reeves-Polak-Ribiere), but it hasn't been ported to MiniRosetta because no one used it.
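For reference, the heart of an FRPR-style conjugate gradient scheme is the direction update, which mixes the new gradient with the previous search direction. This is a textbook sketch of the Polak-Ribiere form (clipped at zero, the common PR+ safeguard), not the retired Rosetta++ code:

```python
import numpy as np

def polak_ribiere_direction(g_new, g_old, d_old):
    # beta carries over information from the previous step;
    # beta = 0 reduces to plain steepest descent
    beta = max(0.0, g_new @ (g_new - g_old) / (g_old @ g_old))
    return -g_new + beta * d_old
```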
Quasi-Newton methods accumulate information about the second derivatives of the energy function in the form of an approximate inverse Hessian. This information is used to modify the descent direction (after the first step) so that it no longer points straight down the gradient, but the minimization converges (much) faster. In Rosetta, this is the "dfpmin" family of functions. Although named for the DFP (Davidon-Fletcher-Powell) update method, it in fact uses the BFGS (Broyden-Fletcher-Goldfarb-Shanno) update method, which is widely regarded as better. A limited-memory variant (L-BFGS) exists but has not been incorporated into Rosetta, because the cost of keeping the full N x N inverse-Hessian matrix (N = number of DOFs) in memory appears not to be prohibitive, and Paul Tseng found that L-BFGS gave worse solutions.
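The difference between the DFP name and the BFGS behavior comes down to which rank-two correction is applied to the inverse-Hessian approximation after each step. A sketch of the standard BFGS inverse-Hessian update, with illustrative names rather than Rosetta's internals:

```python
import numpy as np

def bfgs_update(H, s, y):
    # H: current N x N inverse-Hessian approximation
    # s: step just taken (x_new - x_old)
    # y: change in gradient (g_new - g_old)
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    # rank-two update; the result stays symmetric positive definite
    # as long as y @ s > 0 (guaranteed by a suitable line search)
    return (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
           + rho * np.outer(s, s)
```

The next descent direction is then `-H @ g`, which is what bends the search away from the raw gradient.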
Unlike linmin, all the dfpmin algorithms are **multi step** algorithms, so they need only be called once to reach the (local) minimum of a function.
"dfpmin" uses an exact line search, and Jim says it requires ~10 function evaluations per line search.
"dfpmin_armijo" uses an inexact line search, where the step along the search direction only needs to improve the energy by a certain amount, and also make the gradient a certain amount flatter (but not necessarily reach the minimum). This ends up being significantly more efficient, as once it gets going only 2-3 function evaluations are needed per line search, and approximately the same number of line searches are needed as for the exact dfpmin.
"dfpmin_armijo_nonmonotone" uses an even less exact line search along the descent direction, so that the step need only be better than one of the last few points visited. This allows temporary *increases* in energy, so that the search may escape shallow local minima and flat basins. Jim estimates this is several times more efficient than the exact dfpmin.
Jim recommends dfpmin_armijo or dfpmin_armijo_nonmonotone.
[This discussion applies only to dfpmin and its variants. Other minimizers may have different notions of tolerance.]