Rosetta 3.4
Commands for the cluster application


This document was edited May 25th 2010 by Mike Tyka. This application in mini was created and documented by Mike Tyka et al.

Purpose and Algorithm

The "cluster" application in Rosetta carries out a simple clustering of structures (in either PDB or silent file format). The algorithm is based on one of Phil Bradley's old programs (silent_cluster_c). Starting with a subset of structures (the first 400 structures), the algorithm finds the structure with the largest number of neighbors within the cluster radius and creates a first cluster with that structure as the cluster center; its neighbors become members of, and are claimed by, that cluster. The claimed structures are removed from the pool of "unclaimed" structures. The procedure is then repeated until every structure in the subset has been assigned to a cluster. The remaining structures are then assigned to clusters one by one (this avoids having to calculate a full RMS matrix). The rule is that each structure joins the cluster whose center it is most similar to. If the closest cluster center is more than "cluster_radius" away, the structure forms a new cluster. This rule is applied to all remaining structures. Clusters can be size-limited, sorted by energy, etc. (see the options below).
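The two phases described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Rosetta implementation: the function name `cluster`, its parameters, and the use of a caller-supplied `distance` function (standing in for RMS or GDT comparison) are assumptions, as is treating "within the cluster radius" as less than or equal to the radius.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def cluster(
    structures: Sequence[T],
    distance: Callable[[T, T], float],  # hypothetical stand-in for RMS/GDT
    radius: float,
    initial_subset: int = 400,
) -> List[List[int]]:
    """Return clusters as lists of indices into `structures`, center first."""
    n_first = min(initial_subset, len(structures))
    unclaimed = set(range(n_first))
    clusters: List[List[int]] = []

    # Phase 1: within the initial subset, repeatedly pick the unclaimed
    # structure with the most unclaimed neighbors and make it a cluster center.
    while unclaimed:
        best_center, best_neighbors = -1, []
        for i in sorted(unclaimed):
            neighbors = [
                j for j in sorted(unclaimed)
                if j != i and distance(structures[i], structures[j]) <= radius
            ]
            if best_center < 0 or len(neighbors) > len(best_neighbors):
                best_center, best_neighbors = i, neighbors
        clusters.append([best_center] + best_neighbors)
        unclaimed -= {best_center, *best_neighbors}

    # Phase 2: assign each remaining structure to the nearest cluster center,
    # or start a new cluster if that center is farther away than the radius.
    for k in range(n_first, len(structures)):
        dists = [distance(structures[k], structures[c[0]]) for c in clusters]
        if not dists or min(dists) > radius:
            clusters.append([k])
        else:
            clusters[dists.index(min(dists))].append(k)
    return clusters


# Toy usage: 1-D "structures" compared by absolute difference.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 0.15, 9.0]
result = cluster(points, lambda a, b: abs(a - b), radius=0.5, initial_subset=5)
print(result)  # [[0, 1, 2, 5], [3, 4], [6]]
```

Only the initial subset requires all-against-all comparisons; every later structure is compared against the current cluster centers only, which is why a full distance matrix is never needed.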

Command lines

Sample command

cluster.linuxgccrelease @flags > cluster.log

The cluster application accepts the general file I/O options common to all Rosetta applications:

   -database                 Path to rosetta databases
   -in:file:s                Input pdb file(s)
   -in:file:silent           Input silent file
   -in:file:fullatom         Read as fullatom input structure
   -out:file:silent          Output silent structures instead of PDBs
   -score:weights            Supply a different weights file (default is score12)
   -score:patch              Supply a different patch file (default is score12)
   -run:shuffle              Use shuffle mode
   -nstruct                  Number of decoys to make per input structure

Options specific to cluster

   -cluster:radius  <float>                    Cluster radius in Å (for RMS clustering) or in inverse GDT_TS (for GDT clustering). Use "-1" to trigger automatic radius detection
   -cluster:gdtmm                              Cluster by gdtmm instead of rms
   -cluster:input_score_filter  <float>        Ignore structures above certain energy
   -cluster:exclude_res <int> [<int> <int> ..] Exclude residue numbers from structural comparisons
   -cluster:limit_cluster_size      <int>      Maximal cluster size
   -cluster:limit_clusters          <int>      Maximal number of clusters
   -cluster:limit_total_structures  <int>      Maximal number of structures in total
   -cluster:sort_groups_by_energy              Sort clusters by energy.

cluster -database ~/rosetta_database -in:file:silent silent.out -in::file::binary_silentfile -in::file::fullatom -native 1a19.pdb

Clustered poses are given output names of the form c.i.j, which denotes the jth member of the ith cluster.
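
When post-processing results, the c.i.j naming scheme can be split back into cluster and member indices. The following is a minimal sketch; the helper names `parse_cluster_name` and `group_by_cluster` are hypothetical, and tolerating a trailing ".pdb" extension is an assumption about how the names appear on disk.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Matches "c.<i>.<j>", optionally followed by ".pdb" (the suffix is an assumption).
_NAME = re.compile(r"^c\.(\d+)\.(\d+)(?:\.pdb)?$")


def parse_cluster_name(name: str) -> Tuple[int, int]:
    """Return (cluster index i, member index j) for a name of the form c.i.j."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a c.i.j cluster name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def group_by_cluster(names: Iterable[str]) -> Dict[int, List[str]]:
    """Group names by cluster index, ordering members by member index."""
    groups: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    for name in names:
        i, j = parse_cluster_name(name)
        groups[i].append((j, name))
    return {i: [n for _, n in sorted(members)] for i, members in groups.items()}


print(parse_cluster_name("c.3.12"))  # (3, 12)
```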
