Computational Structure Prediction and Design of Biomolecular Structures

Scientists in this geographically distributed post-baccalaureate program have the opportunity to participate in research that uses and develops the Rosetta Commons software. The Rosetta Commons software suite includes algorithms for computational modeling and design of proteins and other biomolecules. It has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes.

This one-year post-baccalaureate program is aimed at preparing underrepresented minority and/or disadvantaged students to succeed in PhD programs.



  • One week of Rosetta Code School (June 1 through June 5), where you will learn the inner details of the Rosetta Python code and community coding environment, so that you are fully prepared to conduct research using the software.
  • Assignment to a Rosetta lab where you will be mentored by a graduate student and faculty member who will guide and foster your research.
  • Participation in the Summer Rosetta Conference in the gorgeous Cascade Mountains of Washington State (August 4 through August 7) and the Winter Rosetta Conference (location TBD in February 2021), where you will connect with Rosetta developers from around the world.
  • Salary, health benefits, and funding for conference travel are included.

PREP Provides:

  • Research experience: Scholars conduct hypothesis-driven research in their Mentor’s lab, with day-to-day guidance from an experienced PhD student or postdoc. Scholars participate fully in weekly lab meetings, attend weekly research seminars in their department, and attend a PhD program retreat and a national conference of their choice.
  • Community: Scholars come together each month for two-hour ‘chalk-talk’ events to present and discuss their research with Peer-Mentors (PhD students, postdocs) and faculty.
  • Project (‘mini-thesis’) meetings: Scholars gain confidence by organizing, preparing for, and convening three one-hour ‘mini-thesis’ meetings with two subject-expert faculty, plus their research mentor and the PREP Director. Scholars benefit both scientifically and professionally by building strong working relationships with multiple faculty members at Johns Hopkins who are experts in their field of interest.
  • Professional training and custom mentoring: Scholars participate in workshops designed to improve their scientific writing skills and their understanding of ethics in science, and can choose from many other workshops, including communication and improvisation. Each Scholar charts an individual development plan with the PREP Director, with custom mentoring both formal (monthly one-hour meetings) and informal (as needed).
  • Preparation for the GRE or MCAT exam, graduate school applications, and interviews.
  • Salary, health, tuition and other benefits


  • Individuals from racial and ethnic groups that have been shown by NSF to be underrepresented in health-related sciences on a national basis.
  • Individuals with disabilities, who are defined as those with a physical or mental impairment that substantially limits one or more major life activities, as described in the Americans with Disabilities Act of 1990, as amended.
  • Individuals from disadvantaged backgrounds
  • U.S. citizens, permanent residents, and U.S. nationals are eligible.
  • Undergraduate major in computer science, engineering, mathematics, chemistry, biology, and/or biophysics
  • While not required, we seek candidates with some combination of experiences in scientific or academic research, C++/Python/*nix/databases, software engineering, object-oriented programming, and/or collaborative development.


  • Resume
  • Unofficial transcript
  • Personal statement that summarizes why you are an appropriate candidate (up to 4000 characters) including:
      • Why this program interests you
      • Brief summary of research and computing experience
      • Research career goals
  • Two recommendation letters; completed recommendations can be sent to
  • Select top three labs and projects of interest from the list below.
  • Deadline for receipt of applications is February 1, 2020.
  • Deadline for receipt of recommendation letters is February 5, 2020.
  • Program contact: Camille Mathis:


Bahl Lab @ Harvard University’s Institute for Protein Innovation in Boston, MA
"Engineering brighter, more modular methods for labeling reagent antibodies"
Antibody reagents have been the backbone of modern investigative biology for decades, and the state-of-the-art methods for labeling antibodies with fluorescent molecules are also decades old. We are developing new methods to covalently attach fluorescent molecules to antibodies where the distance and geometry between fluorophores are rationally controlled in order to minimize static quenching. Our new approach produces brighter antibodies. Our method is modular, easy to use, and leverages de novo protein design.

Gray Lab @ Johns Hopkins University in Baltimore, MD
“Antibody engineering by deep learning”
Antibodies are an excellent model system for loop structure prediction and design, a difficult problem in the field. High-resolution models of the loop structure are necessary for successful docking to antigens or for design for improved affinities, yet traditional loop prediction methods have struggled with antibody loops because of their extreme variability. In this project, the student will apply deep learning methods, including transfer learning and attention gating, to leverage data from a large set of protein structures and focus predictions on the key loop. The PREP trainee will learn antibody engineering, homology modeling and docking, and machine learning.

Kuhlman Lab @ University of North Carolina, Chapel Hill in Chapel Hill, NC
"Designing subunit vaccines for dengue virus"
Dengue virus infects over 400 million people each year worldwide, and currently there are no effective and safe vaccines available for the virus. We are using protein design to stabilize components of the major surface protein of dengue virus so that, when injected into a person, they will elicit antibodies that broadly protect against different serotypes of the virus. The trainee will learn how to use Rosetta to identify mutations that stabilize proteins and will learn how to biophysically characterize the stability of purified proteins.

Lindert Lab @ Ohio State University in Columbus, OH
"Protein structure prediction from mass spectrometry data"
Mass spectrometry-based methods such as covalent labeling, surface-induced dissociation, or ion mobility are increasingly used to obtain information about protein structure. However, in contrast to high-resolution structure determination methods, this information is not sufficient to deduce all atom coordinates and can only inform on certain elements of structure, such as solvent exposure of individual residues, properties of protein-protein interfaces, or protein shape. Computational methods are needed to predict high-resolution protein structures from the mass spectrometry data. This project will develop algorithms within the Rosetta software package that use mass spectrometry data to guide protein structure prediction.

Rocklin Lab @ Northwestern University in Chicago, IL
"Design of ultra-stable protein scaffolds"
Small, de novo designed proteins have the potential for widespread use as therapeutics, vaccines, and diagnostics. Compared with other protein technologies, designed proteins can achieve extreme stability against denaturation, although we do not yet understand the rules for designing highly stable proteins. We are developing new high-throughput assays to measure resistance to unfolding, aggregation, and degradation for thousands of proteins in parallel. This project will apply these assays to discover new principles for protein stability, and apply these principles to design hyperstable proteins.

Siegel Lab @ University of California, Davis in Davis, CA
"Integrative genomic mining for novel enzyme function"
Enzyme families are naturally diverse in functionality, and with the rise of high-throughput genome sequencing there are thousands of sequences for almost any enzyme class of interest, but with no known structure or experimentally determined function. We are using a combination of molecular modeling and synthetic biology to identify candidate enzymes within a family that are predicted to catalyze a reaction of interest, and obtaining genes for the top candidates to characterize experimentally. Enzyme applications span food, medicine, fuels, and chemicals.

Slusky Lab @ University of Kansas in Lawrence, KS
"Enzyme design for bioremediation"
The design of novel enzymes could transform environmental remediation. For example, enzymes that degrade pollutants would be presented on the surface of bacterial cells. The products of the degradation would then be metabolized as carbon sources for the same or other organisms. However, enzyme design is not a solved problem. This project will use new machine learning methods for designing proteins involved in environmentally relevant degradation reactions.

O'Meara Lab @ University of Michigan in Ann Arbor, MI
"Design of bacterial biocatalysts for scarring"
Inflammatory scarring is a core problem in a wide range of diseases, from heart disease to stroke to cancer. Interestingly, the ChABC enzyme found in some gram-negative bacteria is able to degrade glial scars, leading to nerve regeneration. We have recently had success in using Rosetta to stabilize ChABC to make it a more effective biotherapeutic. Now, we would like to engineer more proteoglycan-degrading enzymes from bacteria to tackle other types of inflammatory scarring.