Jared Adolf-Bryfogle email@example.com
Interactive Modelling, multiprocessing protocols, file export, Rosetta setup, exploration, useful Rosetta functions and analysis. Please feel free to add whatever you feel is useful to users and developers alike. For more information, please refer to the doxygen documentation and paper from the RosettaCon2012 Collection.
The PyRosetta Toolkit has unfortunately not been ported to PyRosetta-4; it is only distributed with, and compatible with, PyRosetta-3.
Copy the code from pyrosetta/SetPyRosettaEnvironment.sh to your .bashrc (Linux) or .bash_profile (Mac). Give the full path where it says PYROSETTA=, then source the file. It is also useful to add a shortcut to pyrosetta/ipython.py in your profile.
There is an awesome file you can put in your $HOME directory called .pymolrc. PyRosetta also comes with a file called PyMOLPyRosettaServer.py. Add this command to your .pymolrc: 'run path_to_pyrosetta/PyMOLPyRosettaServer.py'. This starts the connection every time you open PyMOL, and the GUI is heavily integrated with this PyRosetta feature.
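If you would rather script the .pymolrc step than edit the file by hand, something like the following could work. It is only a minimal sketch: the helper name `add_pymol_server` is made up for this example, and the path you pass in is whatever your own PyRosetta install uses.

```python
import os


def add_pymol_server(server_script, pymolrc_path=None):
    """Append 'run <server_script>' to .pymolrc unless the line is already there.

    Hypothetical helper, not part of the toolkit.
    Returns True if the file was changed, False if the line already existed.
    """
    if pymolrc_path is None:
        pymolrc_path = os.path.join(os.path.expanduser("~"), ".pymolrc")
    run_line = "run " + server_script

    content = ""
    if os.path.exists(pymolrc_path):
        with open(pymolrc_path) as handle:
            content = handle.read()
    if run_line in [line.strip() for line in content.splitlines()]:
        return False

    with open(pymolrc_path, "a") as handle:
        # Make sure we start on a fresh line if the file lacks a trailing newline.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(run_line + "\n")
    return True
```

Calling it twice with the same path is harmless, since the second call finds the line and leaves the file alone.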
Run the GUI.
The GUI code exists both in the rosetta source code at
main/source/src/python/bindings/app/pyrosetta_toolkit and in the PyRosetta binary distributions in
app/pyrosetta_toolkit If you have sourced SetPyRosettaEnvironment.sh, an alias is created to launch the GUI using the
If you have not compiled the PyRosetta bindings from the rosetta source code, please download and use the PyRosetta binaries (recommended if not a Rosetta developer). Ignore the GUI in the rosetta source as it requires a full PyRosetta compilation.
If sqlite3 has a problem loading, reinstall python with sqlite3. To do this, usually all you have to do is install sqlite3 and reinstall python. It may be more complex sometimes, so google it, and you'll find a ton of information.
Load a PDB.
There are two things you can load right off the bat: a PDB or a PDBLIST. Let's cover the PDB first. Use the file menu to load a PDB or fetch one from the RCSB. A PDB setup window will launch. If it's a PDB that you have loaded into Rosetta before, just select the load button. The Clean/Analyze button will attempt to clean your PDB given the options selected and determine if any residues would be unrecognized by Rosetta.
Once the PDB is loaded into PyRosetta, you have a few choices. The output directory is auto set to the directory you loaded the PDB from, and the output name is the name of the PDB. These can be changed using the upper right corner of the main window. Using file->export->save session you can save this information as well as the PDB loaded for later.
Check out PYMOL.
So, let's take a look at the protein first. Open PyMOL if you haven't already. Read the main documentation for instructions on how to set up PyMOL to listen for information from PyRosetta. Use the visualization menu to show the pose in PyMOL. This sends coordinates to PyMOL. If you want to watch different protocols each time the protein is scored, use the PyMOL Observer checkbox. Note that each time the protein is scored, it will create another frame in PyMOL. This can be used as a snapshot to save anything you model, or any designs you make. The visualization menu has options to change the behavior of the PyMOL Observer, as well as to color regions as they are added.
REALLY check out PyMOL.
Click the PyMOL Visualization Window button in the vis menu. Here are a few cool, hopefully useful things. Double click the 'view residue energy' to send energy colors to pymol. Now, explore your protein with the energy components on the right. More options on the left should work as well such as observing all hydrogen bonds.
Set your Options.
The Options menu allows you to configure a variety of Rosetta specific options through the 'Configure Options System' button, including command-line options such as dun10. There are a few common ones, but use the entry box to add your own, and click add option to change the options for this current session. Click save defaults to have it loaded the next time you load the toolkit.
Set your ScoreFunction.
Once again, in the Options tab, click the ScoreFxn Control button. Here you can change the scorefunction, and also edit individual score terms using the bottom half. Double clicking either the score function or patch sets it for this current session. Double clicking on a term in the bottom left allows you to change its weight; double clicking on a term in the bottom right allows you to change its weight from zero to anything.
Note that this window has its own menu. You can do a few things, including saving a new scorefunction, or setting a particular score:patch combo as the default. More scorefunctions can be shown by clicking the 'Show ALL scorefunctions' button in the options menu of the scorefunction window. Note that many of these scorefunctions are either undocumented or not used, which is why they are not listed by default. This is currently a project within the Rosetta dev community. The scorefunction you have set will be used by any protocol which uses a scorefunction. talaris2013 is set as the default for now.
Use some Simple Analysis.
Back to the main window, we have some simple analysis for quickly looking at your pose. On the left, we have some basic analysis options, while on the right are the full suite of RMSD tools. Yay rms_util!!!
Take a look, score your pose. When the pose is loaded, it is also set as the native pose, so the RMSD reported here is NATIVE RMSD. Let's do some loop min.
CODE: toolkit/window_main/frames/SimpleAnalysisFrame.py CODE: toolkit/modules/tools/analysis.py
Note some analysis movers (Interface, PackStat, VIP, Loop) are now available in the advanced tab.
Set some regions.
On the bottom left, you can select a region. You have start, end, and chain. All of that pose.pdb_info().pdb2pose(chain, resnum) conversion happens in the background, so the GUI is made to work specifically with PDB numbering. You can specify a loop, a residue, a chain, and/or a terminus. To add a region, use the add region button. It will go into the listbox on the left. To remove a region, click the remove region button. Region colors can be sent to PyMOL through the Visualization menu tab.
Ok, so how do we specify more than a loop?
Termini - Specify only the end and chain for an Nter floppy tail, and only the beginning and chain for Cter.
Chain - Specify only the chain.
Individual Residue - Give the same value for start and end.
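The rules above boil down to which of the three fields are filled in. Here is a rough sketch of that logic in plain Python. The function name `classify_region` and the returned labels are made up for illustration and are not taken from toolkit/modules/Region.py.

```python
def classify_region(start=None, end=None, chain=None):
    """Return the kind of region implied by which fields are filled in.

    Hypothetical helper: the labels are illustrative, not the toolkit's own.
    start and end are PDB residue numbers, chain is a one-letter chain ID.
    """
    if chain is None:
        raise ValueError("every region type needs a chain")
    if start is None and end is None:
        return "chain"      # only the chain was given
    if start is None:
        return "nter"       # only end + chain: N-terminal tail up to `end`
    if end is None:
        return "cter"       # only start + chain: C-terminal tail from `start`
    if start == end:
        return "residue"    # same value for start and end
    if start > end:
        raise ValueError("start must not be greater than end")
    return "loop"
```

For example, `classify_region(end=10, chain="A")` comes back as an N-terminal tail, while `classify_region(24, 24, "L")` is a single residue.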
CODE: toolkit/window_main/frames/InputFrame.py CODE: toolkit/modules/Region.py
On the bottom right we have some quick protocols. These are meant mostly for modeling, as we all know we need as many processors as possible to do most of our hardcore work. Currently, these are broken down into full protocols and regional protocols. Click the right button to relax the loop. If you want to save the PDB afterward, in the upper right there is a write after protocol button. Clicking this will use the job distributor to write the PDB after the protocol. Now we can use the RMSD analysis to get back info on each loop, total, and an average of all loops.
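To make the "each loop, total, and average" output concrete, here is a small sketch of how those three numbers relate. It is plain Python on raw coordinates, not the rms_util code the GUI calls. It assumes one (x, y, z) point per residue (say, the CA atom) and does no superposition; the names `rmsd` and `loop_rmsd_report` are made up for this example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def loop_rmsd_report(native, model, loops):
    """Per-loop RMSD, RMSD over all loop residues, and the mean of the per-loop values.

    native, model: dicts mapping residue number -> (x, y, z), one point per residue.
    loops: list of (start, end) residue numbers, inclusive.
    """
    per_loop = {}
    all_native, all_model = [], []
    for start, end in loops:
        residues = range(start, end + 1)
        nat = [native[r] for r in residues]
        mod = [model[r] for r in residues]
        per_loop[(start, end)] = rmsd(nat, mod)
        all_native.extend(nat)
        all_model.extend(mod)
    return {
        "loops": per_loop,
        "total": rmsd(all_native, all_model),
        "average": sum(per_loop.values()) / len(per_loop),
    }
```

Note that the total is computed over all loop residues pooled together, so it is generally not the same number as the average of the per-loop values.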
Note that the chainbreak score is set to 100 for minimization.
Additionally, there is the protocols tab in the menu. Here you can specify the number of processors you wish to use, as well as run a few other protocols, and open up a web browser to many different webservers from the Rosetta community.
CODE: toolkit/window_main/Frames/QuickProtocolsFrame.py CODE: toolkit/modules/protocols CODE: toolkit/modules/protocols/ProtocolBaseClass.py CODE: toolkit/modules/protocols/MinimizationProtocols.py
Since I am in Roland Dunbrack's lab, I integrated SCWRL into the GUI. If you want the MOST accurate packing of sidechains shown in the literature, that's where you would use SCWRL. To get it working in the GUI, go to http://dunbrack.fccc.edu/scwrl4/ to download the specific OS version you need, cd into the SCWRL directory, and install it in the given directory. Now you can use the pack (SCWRL) option to pack your chain, loop, residue, etc. just like you did before. Or, you could output a SCWRL sequence file from the export menu. This is a file with the full sequence of the protein and the residues to be repacked in CAPS.
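As a rough illustration of what goes into that sequence file, the sketch below builds the sequence string from a one-letter sequence and a set of positions to repack. It assumes the residues that are not repacked are written in lowercase, which the text above does not spell out. The function name `scwrl_sequence` is made up here and is not the code in toolkit/modules/tools/output.py.

```python
def scwrl_sequence(sequence, repack_positions):
    """Return the sequence with repacked residues in CAPS and the rest in lowercase.

    sequence: one-letter amino acid string for the whole protein.
    repack_positions: iterable of 0-based indices into `sequence`.
    Assumption: fixed residues are written in lowercase.
    """
    repack = set(repack_positions)
    return "".join(
        aa.upper() if index in repack else aa.lower()
        for index, aa in enumerate(sequence)
    )
```

So repacking positions 2 through 4 of `MKTAYIAK` gives `mkTAYiak`.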
CODE: toolkit/SCWRL/ CODE: toolkit/modules/tools/output.py CODE: toolkit/modules/protocols/MinimizationProtocols.py
Clicking the protein design menu option and then the design file toolbox opens an interactive way to make Resfiles. I find it's much easier to use this than to write them by hand. Specify a residue, and you are given some biochemical values. These are old, but remain heavily used. If you don't know what they are, read about them through the help menu. Once you choose a residue, the first listbox on the left changes. It has info on the conserved residues to make specifying these easier. Exploring the listbox populates the second listbox. Three types are available - individual residues, ALL, and ALL+Self. Double clicking these will populate the last listbox. These are the current residues you have set for design for each residue. Double clicking any residue in the final listbox will remove it from your design. You can click next residue to then modify the next one. Finally, instead of one residue you can specify a region, such as 23:42, and then add a residue to the design. It will be set for every residue in the region. Chains and termini are not currently implemented. You can then save a resfile, clear it, or clear everything for a specific residue. Loading a new pose resets the dictionary of designs.
Adding NCAAs to this window is in the works.
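For reference, here is a small sketch of what ends up in a saved resfile. It is not the toolkit's resfile_design.py: the helper names `expand_region` and `resfile_text` are made up, and the NATRO default for residues you did not touch is an assumption. The PIKAA lines follow the standard Rosetta resfile syntax.

```python
def expand_region(spec):
    """Turn '23:42' into [23, ..., 42] and '23' into [23]."""
    if ":" in spec:
        start, end = (int(part) for part in spec.split(":"))
        return list(range(start, end + 1))
    return [int(spec)]


def resfile_text(designs, default="NATRO"):
    """Build resfile text from a dictionary of designs.

    designs: dict mapping (pdb_resnum, chain) -> iterable of one-letter codes.
    default: behavior for residues not listed (assumed here to be NATRO).
    """
    lines = [default, "start"]
    for resnum, chain in sorted(designs):
        allowed = "".join(sorted(set(designs[(resnum, chain)])))
        lines.append("%d %s PIKAA %s" % (resnum, chain, allowed))
    return "\n".join(lines) + "\n"
```

A region like 23:25 on chain A restricted to L, I and V would then be built with `designs = dict(((r, "A"), "LIV") for r in expand_region("23:25"))` and written out with `resfile_text(designs)`.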
CODE: toolbox/modules/definitions/restype_definitions.py CODE: toolbox/window_modules/design_window/resfile_design.py
Full Control Window
Sometimes, you just want to design single residues, pack single residues, or even modify the dihedral angles of specific residues to make a model, or get something (like a terminus) ready for Rosetta. You also may be interested in rotamer probabilities and energies of a particular residue. Here is where this window comes into play. It is located in the Advanced tab of the menu. After specifying a residue, you can first change the dihedrals using the entry box or the slider. No changes are made until you click the delta button. Double clicking a residue type in the middle listbox will mutate that particular residue using one of the PyRosetta convenience functions. On the right, you can then repack the residue, or relax it, or relax that residue and its neighbors. Note that in this situation, the backrub move would be pretty awesome, and I will be implementing it in the future. You can also observe individual energy terms of the residue by clicking one of the terms in the bottom listbox. This grabs the terms using the ScoreFxn object as before, and IS weighted! This can be useful if you're looking for a design that lowers a specific energy term. I know, very specific, but sometimes, just what you need! Clicking the next residue or previous residue will update all energies and probabilities in this window including the energy term set.
You can also add variants to your pose here. Yay phosphorylations!
CODE: toolbox/window_modules/full_control/FullControl.py CODE: toolbox/window_module/scorefunction/ScoreFxnControl.py
Last of the major functions I have had the time to implement are the PDBLIST functions. I do all of my work with PDBLISTs of full paths. It makes things easier, and having the full paths means I can work with them in multiple places and multiple scripts. Some users and modelers, on the other hand, might not have the knowledge to really work with these or even make one. This is for them, and for when I'm feeling a bit lazy, honestly. You can specify a PDBLIST in the input part of the main window. You can also create one in the menu. It will create a file called PDBLIST.txt in the directory specified and set the filename to this. There are a few things you can do now. In the ScoreFxn window, you can rescore all the files and output a file with PATH : SCORE. Next I want to integrate the score filters for users. In the menu, you can output a FASTA file for each PDB, or even for the region specified, which is kinda nice for sequence Logos, etc. You will be able to use calibur in a few weeks, and right now, you can convert all of the PDBs to an SQLITE3 DB. This is basically a way to store a large number of PDBs for analysis, etc. Note that it is not the Rosetta DB structure, it is my own. There will be an option to convert them the Rosetta way in the future. I am not sure how to do the conversion in PyRosetta yet.
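If you have never made a PDBLIST, it is just a text file with one full path per line. The sketch below shows one way to build one and to write the PATH : SCORE file. It is not the toolkit's own code: the helper names are made up, the three-decimal score format is a guess, and `score_func` stands in for whatever loads and scores a PDB (in the GUI, that is PyRosetta with your current scorefunction).

```python
import os


def make_pdblist(directory, list_name="PDBLIST.txt"):
    """Write the full path of every .pdb file in `directory` to a PDBLIST file."""
    directory = os.path.abspath(directory)
    paths = sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(".pdb")
    )
    list_path = os.path.join(directory, list_name)
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return list_path


def read_pdblist(list_path):
    """Return the non-empty lines of a PDBLIST as a list of paths."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def write_scores(list_path, score_func, out_path):
    """Write one 'PATH : SCORE' line per entry in the PDBLIST.

    score_func: any callable taking a path and returning a number
    (a hypothetical stand-in for loading and scoring the pose).
    """
    with open(out_path, "w") as handle:
        for path in read_pdblist(list_path):
            handle.write("%s : %.3f\n" % (path, score_func(path)))
```

Because the list holds full paths, the same PDBLIST.txt can be passed to other scripts from any working directory.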
CODE: toolbox/modules/tools CODE: toolbox/modules/calibur.py CODE: toolbox/modules/PDB.py
Rosetta Tools Window
In the Rosetta Tools window is the GUI that I built, independent of this program and PyRosetta. It has an if __name__ == "__main__" block, so you can run it by itself if you feel inclined. Its intent was to be an interactive way to create Rosetta options files, since I can't remember names, and I kept forgetting options for the many Rosetta apps that I use. In addition, I wanted a GUI for modellers in my lab to use Rosetta, since for some reason they are a bit scared of the command line. Next, I wanted it to be run on the cluster, so I could quickly set up a Rosetta run at 3 am.
Current Implementation: If you open the window and click the repopulate menu item, there are 4 options. The default option is to load option information and descriptions from doxygen. The second option is a manually curated list of options, descriptions, and documentation. The next option is ALL. This stands for ALL options of each program found in the app directory specified. It runs the -help function of each app, writes a file, and parses that file as neatly as I could. So much information, but sometimes, you want it all. You can return the info for just the specific app selected in the help menu. After repopulating, or even when you open the window, you can choose an app, and the options are listed. Clicking add_option will add the option to the textbox at the bottom. You can edit this. Clicking showpath builder will allow you to look for files, add names, etc. This options file can be saved, loaded, and run. On the right is the documentation of each app, since I am constantly on rosettacommons re-reading the documentation every time I start using new apps.
Next, we have the cluster part. This is supposed to be used for all apps, as the ones that don't use JD2 can be frustrating to set up for the cluster. It generally uses Qsub to run on the cluster. If you have MPI, use that instead.
Here are a few things and links useful to users and you guys. The license, about functions, etc. If something in the GUI is less than self-explanatory, please let me know and we can add it to the help menu.
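For anyone who has not seen one, a Rosetta options file is just one option per line, each starting with a dash. The sketch below shows roughly what the textbox at the bottom is building up. The helper name `options_file_text` is made up for this example and is not part of the Rosetta Tools window; the option names are only examples.

```python
def options_file_text(options):
    """Build the text of a Rosetta options file.

    options: list of (name, value) pairs; use None as the value for a bare flag.
    A leading '-' is added to the name if it is missing.
    """
    lines = []
    for name, value in options:
        if not name.startswith("-"):
            name = "-" + name
        lines.append(name if value is None else "%s %s" % (name, value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same idiom the Rosetta Tools window uses so it can be run on its own.
    print(options_file_text([("in:file:s", "input.pdb"), ("nstruct", 10), ("ex1", None)]))
```

The saved file can then be handed to a Rosetta app with the @ syntax, for example `@my_options.txt` on the command line.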
=== Organization: ===
=== Global Variables ===
How window_main modules work
=== Resources ===
=== Basic Program ===
=== Useful Functions ===
import tkFileDialog, tkSimpleDialog, tkMessageBox
filename = tkFileDialog.askopenfilename(title="Open", initialdir=global_variables.current_directory)
filename = tkFileDialog.asksaveasfilename()
my_float = tkSimpleDialog.askfloat(title="float", prompt="Please enter...", initialvalue=10.0)
my_integer = tkSimpleDialog.askinteger(title="integer", prompt="Please enter...", initialvalue=10)
my_string = tkSimpleDialog.askstring(title="string", prompt="Please enter...")
result = tkMessageBox.askyesno(title="Proceed?", message="Proceed?")
=== Tips ===
if not result: return
Call input_class.options_manager.re_init() on each processor to generate a new seed for that particular instance.
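The reason for the reseeding tip is that processes started from the same parent can otherwise end up with identical random number streams. The sketch below shows the same idea with only the Python standard library. Here `reseed` is a made-up stand-in for the re_init() call, and none of this is the toolkit's own multiprocessing code.

```python
import multiprocessing
import random


def reseed():
    """Stand-in for re-initializing Rosetta: draw a fresh seed from the OS."""
    seed = random.SystemRandom().getrandbits(64)
    random.seed(seed)
    return seed


def worker(task_id, queue):
    """Reseed first, then do the (here trivial) random work for this process."""
    seed = reseed()
    queue.put((task_id, seed, random.random()))


def run_parallel(n_processes):
    """Start n_processes workers and collect (task_id, seed, draw) from each."""
    queue = multiprocessing.Queue()
    processes = [
        multiprocessing.Process(target=worker, args=(task_id, queue))
        for task_id in range(n_processes)
    ]
    for process in processes:
        process.start()
    results = [queue.get() for _ in processes]
    for process in processes:
        process.join()
    return sorted(results)
```

Calling `run_parallel(4)` from a script should give four different seeds and four different draws, which is the behavior you want from independent trajectories.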
=== Main Window ===
If you want to add another section to the main window, put the file in /window_main. It should have three main functions, and inherit from the Tkinter Frame class.
def __init__(self, main, toolkit, **options):
=== Menu Items ===
=== Windows ===
=== Existing Code ===