Beyond crystal forms

While the broad outlines of the loop mechanism employed by the lac repressor were understood, Schulten wanted to look at the system in more detail. Knowing the structure of the lac repressor protein would enable the development of a detailed description of how it functions--how does the lac repressor grab DNA, bend it into a loop, and hold the loop in place despite its resistance?


Lac repressor protein (shown in red) wrestling with DNA. Various structures (shown as colored ribbons, drawn every 100 picoseconds) develop during the simulation. The lac repressor holds its shape during this process, with only the flexible head groups rotating to combat the strain.

The typical way to determine the structure of a protein is to crystallize it into a rigid, repetitive form and then examine the geometry of the crystal. But in this case, the protein in action is moving and flexible. Attempts to crystallize the protein with the DNA loop in place failed; the resulting crystals showed just the protein and the binding sites on the DNA strand, not the actual loop.

"[We] needed to describe the dynamics of the protein," Schulten says. In fact, the paper in which Schulten and two members of his research group , postdoc Alexander Balaeff and graduate student Elizabeth Villa, have described their lac repressor research is titled "What the Crystal Did Not Show." The paper is appearing this spring in the Proceedings of the National Academy of Sciences.

A multiscale approach

Capturing the form of a protein in action is a daunting and computationally intensive task to tackle with molecular dynamics (MD) simulation. The lac repressor is a huge protein; modeling its behavior in a realistic water/salt environment leads to a simulation size of more than 200,000 atoms. Adding the DNA loop increases the simulation to at least 700,000 atoms, a size that could not be simulated for a sufficiently long time even on today's most powerful supercomputers.

To reduce the computational cost of the simulation, Schulten and his group devised a dual, or "multiscale," approach that combines mathematics and computing.

Using the NAMD parallel molecular dynamics code developed by Schulten's group for high-performance simulation of large biomolecular systems, the researchers simulated the motion of the lac repressor protein in a water/salt environment. The motion of the DNA loop was modeled by solving the system of Kirchhoff equations of elasticity, a complex set of differential equations.

"The mathematical description of DNA and the computational modeling of the protein had to communicate with each other," Schulten says. The mathematical model of the DNA loop provided the MD simulation with the forces with which the DNA tried to resist the formation of the loop, while the simulation provided the model with information on how the ends of the loop moved.

Molecular dynamics simulations were carried out at both the Pittsburgh Supercomputing Center (PSC) and NCSA. At PSC, a system of more than 200,000 atoms was simulated for 22.4 nanoseconds using 600 processors; the average production speed was 2.7 nanoseconds per day. The researchers then used the more powerful Mercury cluster at NCSA to simulate a system of more than 300,000 atoms that included a more expansive water/salt environment and allowed them to observe larger conformational changes. On Mercury, 254 processors were used to simulate the system for 17 nanoseconds. The production speed was 2.5 nanoseconds per day, nearly equal to the speed achieved at PSC, where more than twice as many processors were required.
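
As a rough check on these figures, the short script below derives the implied wall-clock time and per-processor throughput for each run. It uses only the numbers quoted above and assumes the stated production speed was sustained for the whole run. The `runs` dictionary and the `summarize` helper are illustrative names, not part of any tool the researchers used.

```python
# Figures as quoted in the article; everything derived from them is an estimate.
runs = {
    "PSC": {"processors": 600, "ns_per_day": 2.7, "simulated_ns": 22.4},
    "NCSA Mercury": {"processors": 254, "ns_per_day": 2.5, "simulated_ns": 17.0},
}


def summarize(run):
    """Return (wall-clock days, nanoseconds per day per processor)."""
    days = run["simulated_ns"] / run["ns_per_day"]
    per_processor = run["ns_per_day"] / run["processors"]
    return days, per_processor


if __name__ == "__main__":
    for name, run in runs.items():
        days, per_processor = summarize(run)
        print(
            f"{name}: about {days:.1f} days of computing, "
            f"{per_processor * 1000:.2f} ps/day per processor"
        )
```

Under that assumption the PSC run corresponds to roughly 8.3 days of computing and the Mercury run to roughly 6.8 days, with each Mercury processor delivering a little over twice the simulated time per day. The comparison is only indicative, since the Mercury system was also larger by about 100,000 atoms.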

Schulten says the group's simulations on the TeraGrid system at NCSA represent one of the first uses of a multiscale approach to describe an important biological system. The approach, he explains, required a computational system capable of handling the simulation of such a large protein system.

"We took advantage of the computing power at NCSA," he says.
