Rosetta Commons Research Experience for Undergraduates
A Cyberlinked Program in Computational Biomolecular Structure & Design
Interns in this geographically distributed REU program have the opportunity to participate in research using the Rosetta Commons software. The Rosetta Commons software suite includes algorithms for computational modeling and analysis of protein structures. It has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes.
- One week of Rosetta Code School (June 5 through June 9) where you will learn the inner details of the RosettaPython code and community coding environment, so you are fully prepared for the summer!
- Eight weeks of hands-on research in a molecular modeling and design laboratory, developing new algorithms and discovering new science.
- The summer will finish with a trip to the Rosetta Conference in the gorgeous Cascade Mountains of Washington State, where you will present your research in a poster and connect with Rosetta developers from around the world. The conference will be held from August 5 through August 8.
- This program is supported by NSF (Award 2244288). Interns will receive housing, paid travel expenses, and a $6,000 stipend.
- Program dates are June 3, 2024 - August 9, 2024.
Include the following in the application:
- Personal statement: why this internship interests you, a brief summary of your research and computing experience, and why you are an appropriate candidate for the internship.
- Two references (complete the reference forms in the application with contact information).
- Your top five labs and projects of interest, selected from the list below.
- Deadline for receipt of applications is February 1, 2024.
- Deadline for receipt of recommendation letters is February 4, 2024.
- Click on Creating a Competitive REU Application for help with preparing your application.
- Program contact: firstname.lastname@example.org
- U.S. citizens, permanent residents, and U.S. nationals are eligible to apply.
- International students who are actively pursuing a bachelor's degree in the United States are eligible to apply.
- International students studying outside of the US are not eligible to apply.
- College sophomores or juniors are preferred.
- Major in computer science, engineering, mathematics, chemistry, biology, and/or biophysics
- Available during the program dates, June 3, 2024 - August 9, 2024.
- If accepted, students on the quarter system can request to take their final exams early or request to have final exams proctored at Johns Hopkins.
- Interest in graduate school
- While not required, we seek candidates with some combination of experiences in scientific or academic research, C++/Python/*nix/databases, software engineering, object-oriented programming, and/or collaborative development (git)
- Students graduating before the start of the program are not eligible for the REU and are encouraged to apply to our RaMP Program.
Available projects and locations:
Baker Lab @ University of Washington in Seattle, WA
"Protein design using generative models"
Students will learn cutting edge deep learning protein design methods, and apply them to current design challenges. Areas of focus include de novo enzyme design and de novo binder design.
Cheng Group @ Merck & Co. in San Francisco, CA
"Predictive models for binding and developability of antibodies"
In antibody drug discovery, two important goals are to improve antigen binding while reducing antibody self-interactions, and modeling is useful in prioritizing engineering efforts. We have generated large datasets along with homology models and conformational ensembles for each antibody in the dataset. The successful student will leverage Rosetta to generate structure-based descriptors and use them in building predictive machine learning models. The student will work with Merck & Co. scientists to assess the advantages of derived predictions alone and in combination with state-of-the-art predictive approaches.
Cooper Lab @ Northeastern University in Boston, MA
“Crowdsourcing protein folding and design”
We are exploring how citizen science and crowdsourcing through video games can help biochemists with their work. To do this, we have developed the game Foldit, a multiplayer online game that allows players without previous experience in biochemistry to work on protein folding and design problems. This project will focus on development of game-related aspects to understand and improve the player experience. Potential projects include virtual reality, procedural content generation, and dynamic difficulty adjustment.
Das Lab @ Stanford University in Stanford, CA
“High-resolution RNA design”
The ability to design RNA binders of proteins, small molecules, and DNA/RNA would transform development of mRNA vaccines, CRISPR therapies, and other RNA-based medicines. This project will explore diffusion-based approaches to this unsolved problem.
Glasgow Lab @ Columbia University in New York, NY
"Computational design of allosteric protein therapeutics"
Perturbations like mutations, binding interactions, and post-translational modifications (PTMs) can change the structural dynamics of a protein, which affect its biomolecular interactions, stability, other PTMs, and catalytic activity. Such structural changes affect protein function, which manifests as aberrant metabolism and disease progression. Understanding how perturbations drive conformational changes in proteins is necessary to characterize dysregulation in disease, but these changes are very difficult to observe, and there are no methods to predict them. Accurate predictions of perturbation-driven conformational changes in proteins would enable the discovery of currently invisible disease mechanisms and the design of highly specific therapeutics. We are developing a method to predict how mutations and ligand binding impact protein conformational ensembles towards uncovering the missing link between mutations and disease phenotypes. We first computationally design a library of barcoded and mutagenized proteins, and then measure the conformational dynamics of library members using a high-throughput hydrogen-deuterium exchange with mass spectrometry (HDX/MS) strategy that we are developing. Using the resulting dataset, we will build a machine learning model that can predict the effects of mutations on the conformational states of proteins.
Gray Lab @ Johns Hopkins University in Baltimore, MD
“Antibody engineering by deep learning”
Antibodies are an excellent model system for loop structure prediction and design, which remain difficult problems. Deep learning has improved high-resolution loop modeling but antigen docking remains challenging, likely due to the lack of multiple-sequence alignments. In this project, the student will combine emerging deep learning models and create and test deep generative and variational models toward designing developable, high-affinity binders for specific epitopes. The REU participant will learn antibody engineering, homology modeling and docking, and machine learning.
Karanicolas Lab @ Fox Chase Cancer Center in Philadelphia, PA
“Designing targeted protein degraders”
PROTACs (PROteolysis TArgeting Chimeras) are a new approach to eliminate activity of a given protein in cells. Rather than inhibiting the protein of interest, PROTACs completely eliminate the target protein by inducing its degradation. PROTACs are bi-functional small molecules that use a chemical linker to join a “warhead” directed against some target protein with a moiety that recruits an E3 ubiquitin ligase. In this project, the student will apply docking to build models of several PROTACs in complex with their target proteins and E3 ubiquitin ligases, then will use these as input to develop a machine learning approach for predicting the efficacy of or designing a given PROTAC.
Khare Lab @ Rutgers University in New Brunswick, NJ
"Designing stimulus-responsive enzymes for targeted chemotherapy"
The ability to design stimulus-responsiveness into any enzyme of choice would aid in our ability to interrogate and intervene in biological processes with exquisitely high spatial and temporal precision. This project focuses on developing, testing, and improving pro-drug activating enzymes that can be used to better target therapeutic delivery. The participant will use computational design, high-throughput experiments, and machine learning-based approaches to obtain enzymes that can be activated by external stimuli such as light, or by the presence of tissue-specific molecules.
King Lab @ University of Washington in Seattle, WA
“ML-based antigen and nanoparticle vaccine design”
Protein design is a powerful tool for designing better vaccines, because many types of information that are important to the immune system can be encoded in amino acid sequences (e.g., antigen conformation, repetitive antigen display on self-assembling proteins, etc.). Recent advances in machine learning tools for protein structure prediction and design (e.g., AlphaFold2, RoseTTAFold, ProteinMPNN, and RFdiffusion) have opened up new possibilities in antigen and nanoparticle vaccine design. The King lab has deep expertise in both areas and has developed multiple nanoparticle vaccines that have entered clinical trials, with one (SKYCovione) recently being licensed for use in multiple countries. Come join us at the Institute for Protein Design to take advantage of the many exciting opportunities in this space!
Khmelinskaia Lab @ Ludwig Maximilian University of Munich in Munich, Germany
"Expanding the functional space of de novo designed protein assemblies"
Computational methods have been recently developed for designing novel protein assemblies with atomic-level accuracy, yet several aspects of current methods limit the structural and functional space that can be explored. We aim to expand the plethora of available protein assemblies for application by introducing and controlling new structural properties (e.g. flexibility, assembly stability, structural switches) and functional moieties (e.g. catalysis, recognition elements, surface binding). Students will have the chance to learn computational protein modelling and design methods, from Rosetta to the latest AI-based algorithms, on the beautiful LMU campus in the south of Munich.
Kortemme Lab @ University of California, San Francisco, in San Francisco, CA
“Computational design of de novo proteins to control biological signaling”
We are working towards engineering synthetic signaling systems built from de novo designed protein components that can recognize inputs, transduce signals, and control programmable outputs. We have a range of projects to create proteins with custom-designed shapes to recognize specific signals, and to engineer switchable protein structures. We integrate computational design, including recent advances from deep learning, and experimental characterization in vitro and in cellular systems.
Kuhlman Lab @ University of North Carolina, Chapel Hill in Chapel Hill, NC
“Design of protein switches and complexes”
Enzymes can dramatically increase the rate of chemical reactions because they bind with high affinity to the reaction transition state. The goal of this project is to develop new machine learning methods for designing small molecule binding sites in proteins. These methods will be useful for creating novel enzymes and biosensors.
Lindert Lab @ Ohio State University in Columbus, OH
"Machine learning-based structure modeling using mass spec data"
Knowledge of protein structure is paramount to our understanding of biological function and for developing new therapeutics. Mass spectrometry experiments that provide some structural information, but not enough to unambiguously assign atomic positions, have been developed recently. These methods offer sparse experimental data, which can also be noisy and inaccurate in some instances. We are developing integrative deep learning-based modeling techniques that enable prediction of protein complex structure from the mass spec data.
Meiler Lab @ Vanderbilt University in Nashville, TN
"Integrating Artificial Intelligence and Protein Structure for Drug Discovery"
The focus of this project is the development of new computer algorithms that integrate ligand-based methods (i.e., AI-driven drug discovery) with structure-based methods (i.e., docking within RosettaLigand). The student will be trained in both types of methods and will afterwards develop and integrate an AI into RosettaLigand for this task. Several drug discovery application projects in cancer, neuroscience, and metabolic diseases are running in the laboratory to test the new method in a realistic practical setting.
Merck Discovery Biologics in Boston, MA
“Design and engineering of therapeutic proteins”
Students will use computational protein design methods (e.g., LLMs and diffusion models) to predict mutations that stabilize therapeutic proteins, improve affinity, or impart novel function. Students will gain skills in computational protein design and protein structure analysis. Students will have the chance to express and characterize designed proteins and experience the drug discovery process.
Merck Protein Engineering Lab in Rahway, NJ
"Design and engineering of novel enzymes"
Enzymes catalyze a diverse set of chemical transformations with significant rate enhancements and with excellent chemo-, stereo-, and regiospecificity. These features, combined with the fact that enzymes operate in aqueous solution and are typically more environmentally friendly than synthetic catalysts, have led to the broad adoption of enzymes in the chemical industries. While enzymes are amazing catalysts, they have evolved to solve the challenges faced by Mother Nature and not the challenges we face today. We use computational protein design and evolution-based methods to engineer and invent new protein functions. This project will leverage our high-throughput automation capabilities with structure-based design and machine learning to engineer enzymes with novel properties. Students will gain experience in computational protein design, machine learning, and wet-lab methods for engineering proteins.
Mills Lab @ Arizona State University in Tempe, AZ
"Design of proteins containing functional non-canonical amino acids"
Students will use Rosetta to design proteins in which non-canonical amino acids (i.e., amino acids that do not exist in nature) are used to provide functionalities that would be difficult to achieve using naturally occurring amino acids alone. Current focuses are on the design of functional metalloproteins and fluorescent proteins that could be used as biosensors.
Rocklin Lab @ Northwestern University in Chicago, IL
"Applying high-throughput experimental data to guide computational protein design"
Today, most computational protein design tools like Rosetta use the features of natural protein structures (which amino acids like to be near each other, what types of structures are very common, etc.) to guide the design of new proteins. However, for many applications, we want to design proteins with properties far beyond what already exists in nature. To achieve this, we need new sources of data - not just natural protein structures - that can guide design into new territory. Our lab develops new experimental methods to measure properties like folding stability, binding affinity, and dynamics for tens to hundreds of thousands of designed or natural proteins at the same time. We then use these new large datasets to guide protein design. We have a range of different projects focused on basic science, therapeutic development, and tools for synthetic biology. Each person's project is described on our website (www.rocklinlab.org). We will work with an intern or post-bac to find which project in our lab is best for their interests.
Schoeder Lab @ University of Leipzig in Leipzig, Germany
“Designing the next generation of gene and cell protein therapeutics”
Computer-assisted protein design has emerged as a lead technology to design tailored therapeutics and vaccines in recent years. In the Schoeder Lab we leverage structure-based methods and machine learning to design novel therapeutics including antibodies, adeno-associated virus vectors for gene therapy, and chimeric antigen receptors for immunotherapy. We combine these computational approaches with experimental validation and biophysical studies. Students will have the chance to gain experience in both computational and wet-lab methods for engineering and characterizing protein therapeutics.
Schueler-Furman Lab @ Hebrew University in Jerusalem, Israel
“How do post-translational modifications change the communication of a protein with its partners?”
Would you like to learn more about how interactions that are mediated by short peptide motifs regulate cellular behavior? Join our lab for the summer to work on a project that will involve different deep learning techniques and modeling using Rosetta to characterize motifs in flexible regions of a protein and their interactions with different partners, and to design specific inhibitors for these interactions.
Sgourakis Lab @ Children's Hospital of Philadelphia in Philadelphia, PA
"Structure-guided design of Chimeric Antigen Receptor T-cell therapy"
Chimeric Antigen Receptor (CAR) T-cell therapy has achieved remarkable efficacy for liquid tumors, but therapeutics for solid tumors have been plagued by issues such as the lack of a sustained response. While several promising cancer-specific antigens have been discovered, most proteins are expressed intracellularly and thus are undruggable by traditional CAR-T therapy. In collaboration with the Maris lab at CHOP, we have developed a peptide-centric CAR that targets peptide/MHC-I complexes, enabling targeting of the intracellular proteome. This project will use a hybrid of traditional Rosetta-based methods and machine learning tools to design new CARs that target disease-associated peptide/MHC-I complexes.
Siegel Lab @ University of California, Davis in Davis, CA
"Computational enzyme design and modeling"
In the Siegel Lab undergraduate research project, students investigate structure-function relationships in enzymes and collect relevant data for the computational protein modeling and design community. Students use computational modeling tools to design novel protein variants, build their variant genes with site-directed mutagenesis followed by sequence verification, learn to express and purify their variant enzymes, and finally characterize them biophysically with colorimetric kinetic and thermal stability assays. The data generated by the students will be used to train biomolecular modeling software to more accurately predict enzyme function, which remains a holy grail in the field. Improvements in in silico model accuracy will translate to large gains in wet-lab efficiency when engineering proteins to tackle today’s grand challenges.
Slusky Lab @ University of Kansas in Lawrence, KS
“Design of biosensors using machine learning”
The most time-consuming step of enzyme design (especially in the case of de novo design or design on previously non-catalytic scaffolds) is experimentally screening dozens or even hundreds of proteins to find the ones that function as intended. We are creating and testing machine learning classifiers to accurately determine which designs will succeed. A successful classifier would dramatically accelerate progress in computational enzyme design and be a significant advance to the state of the art. Designed enzymes could be used for environmental remediation to break down oil spills.
"Building novel drug discovery workflows using computational modeling and machine learning"
This intern will gain industry experience in drug discovery, using and building computational tools to predict various features of protein-protein interfaces and protein-small molecule interactions. The intern will work with a close-knit, cross-functional team including specialists in Rosetta, molecular dynamics, AI/ML, free energy calculations, and software engineering. Projects may include PROTACs and novel peptide therapeutics intended for use as oral medicines.
Yarov-Yarovoy Lab @ University of California, Davis in Davis, CA
"Design of macrocycles, peptides, and antibodies targeting ion channels"
This project aims to design potent and selective macrocycles, peptides, and antibodies as modulators of ion channels and as molecular probes to visualize ion channel activity in live cells. Three recent breakthroughs, (1) high-resolution cryoEM and X-ray structures of ion channels, (2) Rosetta protein design, and (3) AlphaFold protein structure prediction, have together set the stage for design of macrocycles, peptides, and antibodies targeting ion channels. Rosetta Interns will work with an interdisciplinary and collaborative research team and learn how to use Rosetta and AlphaFold to design prototypes of macrocycles, peptides, and antibodies as modulators of ion channels.
"De novo design and characterization of miniprotein therapeutics for the treatment of cancer"
"Using the protein manifold sampler to discover new drugs"
"Vaccine design and antibody design"
"High-throughput prediction of antibody developability from sequence and structural features"
Companies may partner with us and sponsor an intern--click here for more information
Intern Research Posters:
Award Number: 2244288