Malleability Techniques for HPC Systems
Jesus Carretero (1)
(1) Universidad Carlos III de Madrid

ABSTRACT
The current static usage model of HPC systems is becoming increasingly inefficient due to the continuously growing complexity of system architectures, combined with the increased use of coupled applications, the need for strong scaling with extreme-scale parallelism, and the increasing reliance on complex and dynamic workflows. Malleability techniques, for both HPC systems and applications, allow resource usage to be adjusted dynamically to extract maximum efficiency. In this talk we present FlexMPI, a tool being developed in the ADMIRE project that provides intelligent global coordination of resource usage at the application level. FlexMPI performs runtime scheduling of computation, network usage, and I/O across all components of the system architecture, so that it can optimize the exploitation of HPC and I/O resources while, in many cases, also minimizing application makespan. FlexMPI provides facilities such as application world recomposition, which generates a new consistent state when processes are added to or removed from an application; data redistribution to the new application world; and I/O interference detection to migrate congesting processes. We also present an environmental use case co-designed using FlexMPI. The evaluation shows its adaptability and scalability.
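As a hedged illustration of the mechanism underlying application world recomposition (this is plain MPI dynamic process management, not FlexMPI's actual API; the "worker" binary name and process count are hypothetical), the following C sketch spawns extra processes at runtime and merges them with the existing ones into a single, larger communicator:

/* Minimal sketch of dynamic process addition with standard MPI
 * (MPI_Comm_spawn + MPI_Intercomm_merge). NOT FlexMPI's API; it only
 * shows the MPI mechanism a malleability layer can build on. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm, new_world;
    int rank, size;

    MPI_Init(&argc, &argv);

    /* Spawn 2 additional processes running the hypothetical "worker"
     * binary; error codes are ignored for brevity. */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
                   MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);

    /* Merge parents and children into one intracommunicator: the
     * "recomposed application world". */
    MPI_Intercomm_merge(intercomm, 0, &new_world);

    MPI_Comm_rank(new_world, &rank);
    MPI_Comm_size(new_world, &size);
    printf("rank %d of %d in the recomposed world\n", rank, size);

    MPI_Comm_free(&new_world);
    MPI_Comm_free(&intercomm);
    MPI_Finalize();
    return 0;
}

The spawned workers would obtain the intercommunicator via MPI_Comm_get_parent and perform the matching MPI_Intercomm_merge; redistribution of application data over the merged communicator (e.g., with MPI_Scatterv) would follow.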
================================================================================================================
Algorithm and software overhead: a theoretical approach to performance portability
Valeria Mele (1), Giuliano Laccetti (1)
(1) University of Naples Federico II, Italy

ABSTRACT
In recent years, the term "portability" has taken on new meanings: research communities are discussing how to measure the degree to which an application (or library, programming model, algorithm implementation, etc.) is "performance portable". The term "performance portability" has been used informally in computing communities to refer to (1) the ability to run one application across multiple hardware platforms and (2) achieving some notional level of performance on those platforms. Among the efforts devoted to this issue, we note the annual performance portability workshops organized by the US Department of Energy. This article adds a new, more theoretical point of view to the performance portability question, showing the convenience of separating the proper algorithm from the overhead and exploring the different factors that introduce different kinds of overhead. The aim is to identify which part of a program is genuinely environment-sensitive and to exclude from performance portability formulas everything that, as theoretically shown, is not going to change. This kind of work is not far from the idea of "code divergence" proposed by Neely et al. to study "productivity", which, however, is outside the scope of this paper.
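For context (this is the widely used definition of Pennycook, Sewall, and Lee, quoted here as background rather than taken from the article itself), the performance portability of an application a solving problem p over a platform set H is usually defined as the harmonic mean of the per-platform performance efficiencies e_i(a, p):

\[
\mathrm{PP}(a,p,H) =
\begin{cases}
\dfrac{|H|}{\sum_{i \in H} \frac{1}{e_i(a,p)}} & \text{if } a \text{ runs correctly on every platform } i \in H,\\
0 & \text{otherwise.}
\end{cases}
\]

Read against this formula, the article's proposal can be understood as narrowing which portions of a program should enter the efficiencies e_i: the environment-sensitive overhead, rather than the invariant algorithmic core.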
================================================================================================================
Benchmarking A High Performance Computing Heterogeneous Cluster
Luisa Carracciuolo (1), Davide Bottalico (2,3), Davide Michelino (2,3), Gianluca Sabella (2), Bernardino Spisso (3)
(1) CNR - National Research Council, Italy
(2) University of Naples Federico II, Italy
(3) INFN - National Institute for Nuclear Physics, Italy

ABSTRACT
This paper describes the results of benchmarking tests aimed at verifying and validating the solutions implemented during the deployment of the heterogeneous HPC resource acquired by the Datacenter of the University of Naples "Federico II" with the funds of the IBiSCo (Infrastructure for Big data and Scientific COmputing) Italian National Project.
================================================================================================================
Gianluca De Lucia (1), Marco Lapegna (2), Diego Romano (1)
(1) Institute for High Performance Computing and Networking (ICAR), CNR, Naples, Italy
(2) University of Naples Federico II, Naples, Italy

ABSTRACT
The Edge Computing paradigm promises to transfer decision-making processes based on artificial intelligence algorithms to the edge of the network, without the need to query servers far from the data collection point. Hyperspectral image classification is one of the application fields that can benefit most from the close relationship between Edge Computing and Artificial Intelligence. It consists of a framework of techniques and methodologies for collecting and processing images related to objects or scenes on the Earth's surface, employing cameras or other sensors mounted on Unmanned Aerial Vehicles. However, the computing performance of edge devices is not comparable with that of high-end servers, so specific approaches are required to account for the influence of the computing environment on the algorithm development methodology. In the present work, we propose a hybrid technique to make hyperspectral image classification through Convolutional Neural Networks affordable on low-power, high-performance sensor devices. We first use Principal Component Analysis to filter out insignificant wavelengths and reduce the dataset dimension; then, we use a process acceleration strategy to improve performance by introducing a GPU-based form of parallelism.
================================================================================================================
A Generative Adversarial Network approach for noise and artifacts reduction in MRI head and neck imaging
Salvatore Cuomo (1), Francesco Fato (1), Lorenzo Ugga (2), Gaia Spadarella (2), Renato Cuocolo (3), Edoardo Prezioso (1), Fabio Giampaolo (1), Francesco Piccialli (1)
(1) University of Naples Federico II, Department of Mathematics and Applications, Italy
(2) University of Naples Federico II, Department of Advanced Biomedical Sciences, Italy
(3) University of Salerno, Department of Medicine, Surgery and Dentistry, Baronissi, Italy

ABSTRACT
As the volume of data available to healthcare and life sciences specialists proliferates, so do the opportunities for life-saving breakthroughs, but time is a key factor. High-Performance Computing (HPC) can help practitioners analyze data accurately and improve patient outcomes, from drug discovery to finding the best-tailored therapy options. In this paper, we present and discuss an Artificial Intelligence methodology based on a Generative Adversarial Network to improve the perceived visual quality of MRI images of the head and neck region. The experimental results demonstrate that, once trained and validated, our model performs better than state-of-the-art methods, and testing it on unseen real corrupted data improved the quality of the images in most cases.
================================================================================================================
Parallel EUD models for accelerated IMRT planning on modern HPC platforms
Juan José Moreno Riado (1), Janusz Miroforidis (2), Ignacy Kaliszewski (2), Gracia Ester Martín Garzón (1)
(1) Informatics Department, University of Almería, Spain
(2) Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

ABSTRACT
Radiotherapy treatments apply high doses of radiation to tumorous cells to break the structure of cancer DNA, trying at the same time to minimize the radiation doses absorbed by healthy cells. The personalized design of radiotherapy plans has been a relevant challenge since the beginning of these therapies, and a wide set of models has been defined to translate complex clinical prescriptions into optimization problems. The model based on equivalent uniform dose, EUD, is very relevant for IMRT radiotherapy planning in clinical practice: expert physicists can tune plans close to the prescriptions by solving the EUD-based optimization problem in a trial-and-error process. Gradient descent methods can be applied to solve these models, personalized for every patient, but their computational requirements are huge, so applying HPC techniques is necessary to make such models usable in clinical practice. In this work, we have developed two parallel implementations of an EUD model for IMRT planning, on multi-core and GPU architectures, as both are increasingly available in clinical settings. Both implementations are evaluated with two Head&Neck clinical tumor cases on modern GPU and multi-core CPU platforms. Our implementations are very useful since they help expert physicists quickly obtain plans that can satisfy all the prescriptions.
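As background (this is the standard generalized EUD of Niemierko, on which EUD-based planning models build; the notation here is ours, not the authors'), for a structure whose N voxels receive doses d_1, ..., d_N the equivalent uniform dose is

\[
\mathrm{EUD} = \left( \frac{1}{N} \sum_{i=1}^{N} d_i^{\,a} \right)^{1/a},
\]

where a is a tissue-specific parameter: large positive values emphasize hot spots in organs at risk, while negative values emphasize cold spots in the target volume. Objective functions assembled from such terms are differentiable, so gradient descent applies, but each gradient evaluation sweeps over millions of voxel doses, which is precisely the regular, data-parallel workload that maps well to multi-core and GPU platforms.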
================================================================================================================
Environmental data tiling: store in Cloud, process at the Edge
Gennaro Mellone (1), Ciro Giuseppe De Vita (1), Diana Di Luccio (1), Sokol Kosta (2), Raffaele Montella (1)
(1) University of Naples "Parthenope", Naples, Italy
(2) Aalborg University, Copenhagen, Denmark

ABSTRACT
Cloud-based services have proved to be very useful in several research fields, such as engineering, health science, and astrophysics. Among these fields, environmental monitoring activities have also developed a strong need for cloud facilities to store and manage the multidimensional data sets collected by current and future weather observation surveys. In particular, weather forecast models and global sensor networks deal with data sets composed of three or more dimensions. However, these forecasting models require only a relatively small slice of the multidimensional input data set to perform analysis in a specific area or time interval; hence, reducing data dimension for information retrieval is mandatory. We propose a technique to load and retrieve sliced multidimensional data sets on different cloud services, such as Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure. The experimental results obtained on these cloud services highlight that the proposed method can significantly speed up the process of loading and retrieving data slices compared to working with the entire data set in bulk. We also evaluate the performance of the proposed solution in an edge scenario, using containers hosted on a cluster of Raspberry Pis serving a variable workload.
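As a hedged sketch of the slicing operation the abstract describes (our illustration, not the authors' implementation; the array shape and names are hypothetical), the C function below extracts a small hyperslab from a row-major time x lat x lon field, which is the request a tiled cloud layout can answer without shipping the whole data set:

/* Illustrative hyperslab extraction from a flattened 3-D field of
 * shape (T, Y, X), stored row-major. A tiled cloud layout lets a
 * client fetch only the tiles covering the requested sub-block. */
#include <stddef.h>
#include <stdio.h>

/* Copy the sub-block starting at (t0, y0, x0) with extents
 * (nt, ny, nx); only the inner extents Y and X enter the indexing. */
void extract_slice(const float *data, size_t Y, size_t X,
                   size_t t0, size_t y0, size_t x0,
                   size_t nt, size_t ny, size_t nx, float *out)
{
    for (size_t t = 0; t < nt; ++t)
        for (size_t y = 0; y < ny; ++y)
            for (size_t x = 0; x < nx; ++x)
                out[(t * ny + y) * nx + x] =
                    data[((t0 + t) * Y + (y0 + y)) * X + (x0 + x)];
}

int main(void)
{
    enum { T = 4, Y = 3, X = 5 };
    float data[T * Y * X], out[2 * 2 * 2];
    for (size_t i = 0; i < T * Y * X; ++i)
        data[i] = (float)i;
    /* Fetch the 2x2x2 block starting at (t, y, x) = (1, 1, 2). */
    extract_slice(data, Y, X, 1, 1, 2, 2, 2, 2, out);
    printf("first value of the slice: %g\n", out[0]);
    return 0;
}

When the field is stored tile-by-tile in an object store, only the tiles overlapping the requested block need to be downloaded, which is the effect behind the speedup over bulk transfers reported in the abstract.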