JUNE 18–22, 2017

Session Details

Name: Tutorial 01: Understanding and Improving I/O performance on HPC systems
Time: Sunday, June 18, 2017
09:00 am - 06:00 pm
Room:   Analog 1
Messe Frankfurt
Breaks: 08:00 am - 10:00 am Welcome Coffee
11:00 am - 11:30 am Coffee Break
11:00 am - 11:30 am Welcome Coffee
01:00 pm - 02:00 pm Lunch
04:00 pm - 04:30 pm Coffee Break
Presenters:   Keeran Brabazon, Allinea
  Holger Brunst, Technische Universität Dresden
  Adrian Jackson, EPCC
  Tomislav Šubić, Arctur
I/O is a key part of all applications, whether reading in data to start a simulation, writing checkpoint files to protect against hardware failures, or outputting the results of a simulation. Because I/O is often infrequent, especially in computational simulation applications that run at scale on HPC resources, it is often neglected when considering application performance and optimisation. However, as we scale to larger HPC systems, the fraction of application time spent in I/O is increasing. We are also now encountering a new type of application on HPC resources: data-intensive applications, where I/O is a dominant part of the workload. Therefore, understanding the I/O performance of applications, and optimising it, is crucial to enabling efficient computational simulations. Furthermore, whilst compute resources tend to be assigned exclusively to an individual job on an HPC machine, I/O hardware is shared between running jobs, meaning that I/O performance can be variable and that understanding the I/O performance of an application in isolation is often difficult. This tutorial will address how users can assess the I/O performance and capabilities of the systems they are using and of individual applications, and which parallel I/O software and strategies can be used to optimise I/O.