Supervised molecular dynamics (SuMD) is an adaptive sampling technique (Deganutti and Moro, 2017) that uses a tabu-like algorithm to accelerate the simulation of binding events between proteins and small molecules or peptides. It has been applied in several studies (Salmaso et al., 2017; Bower et al., 2018; Cuzzolin et al., 2016; Sabbadin and Moro, 2014). SuMD achieves this speed-up without introducing any energetic bias into the simulations.

In essence, SuMD runs a series of short, unbiased molecular dynamics (MD) simulations. After each simulation, the distances between the centers of mass (or geometrical centers) of the ligand and the predicted binding site, collected at regular time intervals, are fitted to a linear function. If the slope of this function is negative, indicating progress toward the target, the next simulation step starts from the last set of coordinates and velocities. Conversely, if there is no progress (i.e., the slope is non-negative), the simulation is restarted with randomly assigned atomic velocities, so that a different trajectory is explored.

Building on SuMD, multiple walker supervised molecular dynamics (mwSuMD) was introduced to improve the sampling from a specific configuration. Instead of the single short simulation used in the original SuMD, it runs a user-defined number of parallel replicas, known as walkers, as simultaneous short simulations. Because mwSuMD retains one productive replica from each batch of walkers, it gives more control over the total wall-clock time spent on a simulation.

Ideally, each walker runs on its own GPU, so mwSuMD is most efficient on a multi-GPU setup. Modern GPUs that support multiple concurrent threads can still run several walkers at once, at a minor cost in per-walker performance. In this implementation for the ACEMD molecular dynamics engine, mwSuMD requires the following inputs: the initial coordinates of the system as a PDB file, the coordinates and atomic velocities from the equilibration stage, the topology file of the system, and all necessary force field parameters.

mwSuMD can supervise either one metric (X′) or two metrics (X′, X″) of the simulated system over short simulations seeded in batches of walkers. When only one metric is supervised, the decision to continue the mwSuMD simulation can be based on either the slope of the linear function interpolating the metric values or a score. When two metrics are supervised, a dedicated score is used.

In the current study, the supervised metrics included distances between centroids, root mean square deviations (RMSDs), and the number of atomic contacts between selected groups. The choice of metric depends on the system and the problem being addressed. For instance, RMSDs are useful when the final state of the system is known, while distances are useful when the target state is not known in advance.
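As a concrete illustration of the three metric types named above, the following NumPy sketch computes each one for a single frame. It is a simplified sketch under stated assumptions: the function names are hypothetical, the RMSD is computed without prior structural superposition, the 4.0 Å contact cutoff is an arbitrary choice, and the coordinates are made up.

```python
import numpy as np

def centroid_distance(coords_a, coords_b):
    """Distance between the geometric centers of two atom groups."""
    return float(np.linalg.norm(coords_a.mean(axis=0) - coords_b.mean(axis=0)))

def rmsd(coords, reference):
    """RMSD between matching atoms (assumption: no superposition is applied)."""
    diff = coords - reference
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def count_contacts(coords_a, coords_b, cutoff=4.0):
    """Number of atom pairs (one atom from each group) closer than the cutoff."""
    deltas = coords_a[:, None, :] - coords_b[None, :, :]
    pair_distances = np.linalg.norm(deltas, axis=-1)
    return int((pair_distances < cutoff).sum())

# Made-up coordinates (angstroms) for a two-atom "ligand" and a two-atom "site".
ligand = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
site = np.array([[1.0, 3.0, 0.0], [1.0, 5.0, 0.0]])
shifted_ligand = ligand + np.array([0.0, 0.0, 1.0])

print(centroid_distance(ligand, site))
print(rmsd(shifted_ligand, ligand))
print(count_contacts(ligand, site))
```

In practice these quantities would be evaluated for every frame of a walker's trajectory, typically with a trajectory analysis library, to produce the metric trace that is then supervised.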

Importantly, the decision to restart or continue mwSuMD after a short simulation is deferred until all walkers in the batch have finished. The best-performing short simulation is then selected and extended by seeding the same number of walkers, each with the same duration as in the previous steps. For each walker, the score for the supervision of a single metric (SMscore) is computed as the square root of the product of the metric value in the last frame and the average metric value over the short simulation, i.e., the geometric mean of the two.

If the monitored metric is expected to decrease, as in binding or dimerization, the walker with the lowest SMscore continues to the next round. Conversely, if the metric is expected to increase, as in unbinding or the outward opening of domains, the walker with the highest SMscore is selected. Because the last-frame value enters the score alongside the average, the SMscore gives extra weight to the final state of each short simulation, which is the starting point for the next batch.
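The SMscore and the walker selection rule can be illustrated with a short Python sketch. The function names `sm_score` and `select_walker` and the metric traces are hypothetical, and the sketch assumes non-negative metric values (such as distances, RMSDs, or contact counts) so that the square root is defined.

```python
import math

def sm_score(metric_values):
    """Square root of (last-frame value * mean value) for one walker.

    Assumes non-negative metric values so the product is non-negative.
    """
    last = metric_values[-1]
    mean = sum(metric_values) / len(metric_values)
    return math.sqrt(last * mean)

def select_walker(batch, expect_decrease=True):
    """Return (index of the walker to extend, list of scores) for a finished batch."""
    scores = [sm_score(trace) for trace in batch]
    # Lowest score wins for decreasing metrics (e.g., binding),
    # highest score wins for increasing metrics (e.g., unbinding).
    pick = min if expect_decrease else max
    best = pick(range(len(batch)), key=scores.__getitem__)
    return best, scores

# Made-up metric traces (e.g., distances in angstroms) for a batch of three walkers.
batch = [
    [20.0, 19.0, 18.0],
    [20.0, 17.0, 19.5],
    [20.0, 18.5, 16.0],
]

best_binding, scores = select_walker(batch, expect_decrease=True)
best_unbinding, _ = select_walker(batch, expect_decrease=False)
print(best_binding, best_unbinding, [round(s, 2) for s in scores])
```

Note that the second walker reaches the lowest intermediate value (17.0) but ends at 19.5, so it is not selected when the metric should decrease; this reflects the extra weight the score places on the last frame.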

When two metrics, X′ and X″, are both expected to increase during mwSuMD simulations, the score for supervising two metrics (DMscore) is calculated from the values of both metrics in the last frame. The score reflects the combined progress of the two metrics while allowing them to vary with some independence: if one metric slows down significantly, the other can still drive the system's evolution.

Moreover, in contrast to SuMD, atomic velocities are not reassigned when a walker is extended by seeding a new batch of short simulations and the remaining walkers are stopped. This allows individual simulations to be as short as a few picoseconds without introducing artifacts from the thermostat's latency in reaching the target temperature, which is typically around 10 to 20 picoseconds when a simulation is restarted with reassigned velocities.

The current implementation of mwSuMD is written in Python 3 and uses the MDAnalysis and MDTraj modules, two widely used libraries for reading and analyzing molecular dynamics trajectories.