AI-oriented HPC services are redefining the way users leverage the computational and storage resources offered by supercomputers. While much attention has been devoted to improving the processing power of HPC-enabled AI platforms, the design of storage solutions has stagnated. Indeed, storage performance has been shown to be a major bottleneck in HPC infrastructures and remains a pressing research challenge. Current strategies to address this challenge include designing storage-oriented optimizations at the application level (e.g., caching, indexing, scheduling). However, for storage systems operating under a multi-tenant model, application-level optimizations cannot ensure holistic performance guarantees and fairness, which leads to high I/O interference and performance degradation across the applications sharing the storage system.
PAStor will redesign current storage solutions and propose a new generation of programmable and adaptable software-defined storage (SDS) systems. The resulting system is designed to meet the performance and scalability requirements of complex HPC infrastructures and data-centric applications in a holistic fashion. This is an ambitious goal that will redefine the storage landscape of modern HPC infrastructures, open new research directions in the storage field, and directly impact related areas such as AI and HPC.
At the moment, no storage solution with characteristics similar to those offered by PAStor is available in the HPC market. Moreover, the optimization of storage systems for AI workloads is a pressing research challenge. PAStor is also aligned with the Advanced Computing Portugal 2030 strategy. The outcomes of the project will enable MACC to gain access to highly innovative technology and to strengthen its position in the European HPC strategy. The project's findings will help TACC optimize the design, deployment, and management of clusters aimed at exascale computing, a priority area in the US.