Enabling Large-Scale Biomolecular Conformation Search with Replica Exchange Statistical Temperature Molecular Dynamics (RESTMD) over HPC and Cloud Computing Resources

We present the latest developments and experimental simulation studies of Statistical Temperature Molecular Dynamics (STMD) and its parallel tempering version, Replica Exchange Statistical Temperature Molecular Dynamics (RESTMD). Our main contributions are i) the introduction of a newly implemented STMD in LAMMPS, ii) the use of large-scale distributed cyberinfrastructure, including Amazon EC2 and the nationwide distributed computing infrastructure GENI, in addition to High-Performance Computing (HPC) cluster systems, and iii) benchmark and simulation results highlighting the advantages and potential of STMD and RESTMD for challenging large-scale biomolecular conformational search. In this work, we attempt to provide convincing evidence that RESTMD, which combines two advanced sampling protocols, STMD and the replica exchange algorithm, offers advantages over not only conventional ineffective approaches but also other enhanced sampling methods. Notably, RESTMD has benefits over the most popular method, Replica Exchange Molecular Dynamics (REMD), as an application that makes full use of HPC environments. For example, RESTMD alleviates the need for the large number of replicas that is unavoidable in REMD, and it is flexible enough to exploit the maximum available computing power of a cluster system. Continuing our recent effort in which RESTMD was implemented with a community molecular dynamics package, CHARMM, and Hadoop MapReduce, we report the latest development outcomes. First, we plugged the implementation of STMD into LAMMPS, one of the most popular public molecular dynamics packages. This is expected to make STMD and RESTMD appealing to investigators from a broad range of life science fields. 
Second, Hadoop MapReduce-based RESTMD is now able to run on Amazon EC2 and on GENI, a nationwide network virtual organization and distributed computing environment. Third, in order to find optimized parameters for RESTMD simulations, simulation results using two test systems, water and solvated crambin, were obtained and are presented. These results, despite the relatively small system sizes and short trajectories, serve to underscore the merits and potential of STMD and RESTMD with respect to their algorithmic advantages as well as their efficient utilization of distributed resources.