TY - JOUR
TI - Towards frameworks for large scale ensemble-based execution patterns
DO - https://doi.org/doi:10.7282/T39Z96QV
PY - 2015
AB - Towards Frameworks for Large Scale Ensemble-based Execution Patterns, by Vivekanandan Balasubramanian. Thesis Director: Dr. Shantenu Jha. A major challenge in the chemical sciences is to bridge the gap between the ability to study matter at an atomic scale and the ability to predict how those atomic-level details affect behavior at a macroscopic scale. Molecular Dynamics (MD) simulations are a powerful tool for the study of macromolecular systems, as they provide the ability to compute thermodynamic and kinetic parameters accurately. For MD simulations to be effective, they must sample adequately, i.e., efficiently and accurately sample the full conformational space of a molecule. A long-standing debate in the MD community concerns approaches to effective sampling: for a given amount of compute time, which is more likely to yield better sampling, a single long-running simulation or multiple smaller simulations? In this project, we provide support for the scenario in which multiple small simulations are used. Another important requirement faced by the MD community is the need for iterative simulation and analysis stages, from which data can be extracted and, based on probability densities and weights, a new set of trajectories can be generated. These scenarios result in the following computational challenges: (i) executing a large number of tasks concurrently, (ii) supporting heterogeneous resources, and (iii) effective data movement that is correlated with stage transitions. To address these requirements, we developed the Ensemble MD Toolkit (EnMDTK).
The EnMDTK has three primary design features: (1) fundamental support for multiple concurrent simulations, (2) support for different ensemble-based execution patterns, and (3) execution plugins that hide from the user the difficulties of managing the execution of these patterns on heterogeneous systems. The toolkit enables scientists to develop their own applications easily and scalably using the predetermined patterns supported by the EnMDTK. Our contribution to this project is the development, testing, and documentation of the iterative Simulation-Analysis pattern of the EnMDTK. The development required extensive study and testing of the underlying RADICAL-Pilot framework in order to use it in the most effective form. Knowledge of the individual kernels, their workings, and their specific dependencies was required to deploy them on the various resources. Testing revealed optimizations that could be made to the data movement. We also characterized the performance of the final toolkit.
KW - Electrical and Computer Engineering
KW - Molecular dynamics--Computer simulation
KW - Macromolecular Systems
LA - eng
ER -