JUNE 18–22, 2017
FRANKFURT AM MAIN, GERMANY

Presentation Details

 
Name: Fast Matrix-Free Discontinuous Galerkin Kernels on Modern Computer Architectures
 
Time: Tuesday, June 20, 2017
02:45 pm - 03:15 pm
 
Room:   Panorama 3
Messe Frankfurt
 
Breaks:   03:15 pm - 03:45 pm Coffee Break
 
Speaker:   Martin Kronbichler, TU München
 
Abstract:   This study compares the performance of high-order discontinuous Galerkin finite elements on modern hardware. The main computational kernel is the matrix-free evaluation of differential operators by sum factorization, exemplified on the symmetric interior penalty discretization of the Laplacian as a proxy for complex applications in fluid dynamics. State-of-the-art implementations of these kernels stress both arithmetic and memory transfer. The implementation of SIMD vectorization and of shared-memory parallelization is detailed. Computational results are presented for a dual-socket Intel Haswell system with 28 cores, a 64-core Intel Knights Landing system, and a 16-core IBM Power8 system. For moderate polynomial degrees between two and six, the Knights Landing machine is approximately twice as fast as the Haswell system. One core of Haswell is also considerably faster than a Power8 core. For our code, parallelism expressed through for loops performs better on medium to high core counts than task-based parallelism with dynamic scheduling according to dependency graphs, even though the latter algorithm transfers less memory.
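
For readers unfamiliar with sum factorization, the idea can be sketched on a small 2D tensor-product operator. The following is only an illustrative sketch and is not taken from the talk: the function names `apply_naive` and `apply_sum_factorized`, the 1D matrix `A`, and the coefficient array `u` are all assumptions made for this example. It shows how applying the 1D matrix one direction at a time reduces the work per element from O(n^4) to O(n^3) in 2D while giving the same result.

```python
# Illustrative sketch only (hypothetical names, not the speaker's code).
# Computes v[i][j] = sum_{k,l} A[i][k] * A[j][l] * u[k][l] in two ways.

def apply_naive(A, u):
    """Apply the full tensor-product operator directly: O(n^4) work."""
    n = len(A)
    v = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                for l in range(n):
                    s += A[i][k] * A[j][l] * u[k][l]
            v[i][j] = s
    return v


def apply_sum_factorized(A, u):
    """Apply the same operator as two sweeps of 1D products: O(n^3) work."""
    n = len(A)
    # Sweep 1: contract the first index, t[i][l] = sum_k A[i][k] * u[k][l].
    t = [[sum(A[i][k] * u[k][l] for k in range(n)) for l in range(n)]
         for i in range(n)]
    # Sweep 2: contract the second index, v[i][j] = sum_l A[j][l] * t[i][l].
    v = [[sum(A[j][l] * t[i][l] for l in range(n)) for j in range(n)]
         for i in range(n)]
    return v


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]   # assumed 1D matrix (e.g. shape values at quadrature points)
    u = [[1, 0], [0, 1]]   # assumed coefficients on one element
    print(apply_naive(A, u))
    print(apply_sum_factorized(A, u))
```

In three dimensions the same pattern uses three sweeps, and the inner 1D products are the loops that optimized implementations typically target with SIMD vectorization.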