Application optimization with the Cache-aware Roofline Model and Intel oneAPI tools
In this tutorial, we will introduce the Cache-aware Roofline Model (CARM) and present its basic principles for modelling the performance upper bounds of Intel CPU and GPU devices. We will also showcase the CARM implementation in Intel® Advisor and demonstrate how it can be used to drive application optimization. For this purpose, we will rely on epistasis detection, an important application in bioinformatics, as a case study. For both Intel CPUs and GPUs, we will show how CARM can be used to detect execution bottlenecks and provide useful hints on which types of optimizations to apply in order to fully exploit device capabilities. The guidance provided by CARM was fundamental to achieving speedups of more than 20x over the baseline code.

Learning Objectives:
- Develop an in-depth understanding of the Cache-aware Roofline Model, its construction, and its interpretation methodology
- Enable attendees to conduct Roofline analysis of CPU and GPU applications using Intel oneAPI tools
- Demonstrate how to use the Roofline model to guide and evaluate application optimization efforts
- Showcase the successful use of Roofline automation when optimizing a real-world bioinformatics application on both CPU and GPU-accelerated systems
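To make the upper-bound modelling concrete, the sketch below evaluates the standard Roofline bound, min(arithmetic intensity × bandwidth, peak FLOP/s), for a few arithmetic-intensity values. The bandwidth and peak-performance numbers are hypothetical placeholders, not measurements of any particular Intel device; in CARM, arithmetic intensity is computed with respect to the memory traffic as seen by the core, and one roof is drawn per memory level.

```python
# Minimal sketch of a Roofline-style attainable-performance bound.
# All device figures below are assumed placeholder values for illustration only.

def roofline_bound(arithmetic_intensity: float, peak_flops: float, bandwidth: float) -> float:
    """Attainable performance (FLOP/s) for a given arithmetic intensity (FLOP/byte)."""
    return min(arithmetic_intensity * bandwidth, peak_flops)

if __name__ == "__main__":
    PEAK_FLOPS = 1.0e12       # assumed peak compute throughput: 1 TFLOP/s
    L1_BANDWIDTH = 800.0e9    # assumed L1 bandwidth: 800 GB/s
    DRAM_BANDWIDTH = 100.0e9  # assumed DRAM bandwidth: 100 GB/s

    for ai in (0.1, 1.0, 10.0):
        print(f"AI={ai:5.1f} FLOP/byte  "
              f"L1 roof={roofline_bound(ai, PEAK_FLOPS, L1_BANDWIDTH):.3e} FLOP/s  "
              f"DRAM roof={roofline_bound(ai, PEAK_FLOPS, DRAM_BANDWIDTH):.3e} FLOP/s")
```

Plotting a kernel's measured performance against these roofs, as Intel® Advisor does automatically, indicates whether the kernel is bound by compute throughput or by the bandwidth of a given memory level.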