Lawrence Berkeley National Laboratory drives heterogeneous computing with oneAPI’s Math Kernel Library (oneMKL)
oneMKL Random Number Generators Domain now supports Nvidia GPUs
Dr. Vincent R. Pascuzzi
The increasing number of high-performance computing centers around the globe is giving physicists and other researchers access to heterogeneous systems, with multiple central processing units and graphics processing units per node, built on various platforms. However, domain scientists often have limited resources, so writing multiple implementations of their code to target the different platforms is infeasible. To help address this, a number of portability layers are being developed that aim to let programmers write performance-portable code; for example, Kokkos, RAJA, Alpaka and SYCL. Nevertheless, portable application programming interfaces (APIs) often lack some of the features and tools available in a platform-specific API.
The oneMKL open-source interfaces project, part of the oneAPI industry initiative, provides DPC++-based APIs for math algorithms on CPUs and compute accelerator architectures. The interfaces offer a cross-architecture solution for speeding up applications, with efficient, modern linear algebra and pseudorandom number generation functionality that is familiar to C++ developers. In particular, the oneMKL random number generator domain provides commonly used pseudorandom engines with various continuous and discrete distributions. Random number generators are used almost ubiquitously in stochastic algorithms (most importantly, Monte Carlo simulations) applied in math, science, engineering, finance and other fields.
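To make the "engine plus distribution" idea concrete, here is a minimal sketch of a Monte Carlo estimate of pi. It deliberately uses the C++ standard library's `<random>` as a stand-in, not the oneMKL API, and the function name `estimate_pi` is purely illustrative. The point is only the split between an engine that produces raw pseudorandom bits and a distribution that shapes them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Illustrative helper (not part of oneMKL): estimate pi by sampling points
// uniformly in the unit square and counting how many land inside the
// quarter circle of radius 1.
double estimate_pi(std::size_t samples, std::uint32_t seed) {
    assert(samples > 0);

    // Engine: the source of raw pseudorandom bits, fully determined by the seed.
    std::mt19937 engine(seed);
    // Distribution: maps the engine's output onto real numbers in [0, 1).
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    std::size_t inside = 0;
    for (std::size_t i = 0; i < samples; ++i) {
        const double x = unit(engine);
        const double y = unit(engine);
        if (x * x + y * y <= 1.0) {
            ++inside;
        }
    }
    // The quarter circle covers pi/4 of the unit square.
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

With a million samples, a call such as `estimate_pi(1000000, 42)` should land close to 3.14. The exact digits depend on the standard library implementation, since the distribution algorithm is not fixed by the C++ standard.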
Thanks to SYCL’s interoperability functionality and open software, adding support for cuRAND into the oneMKL open-source interfaces project was a straightforward exercise. It is now possible to generate random numbers within SYCL applications on Nvidia GPUs using kernels optimized for these devices. Because the existing optimizations are reused, cross-platform applications written in SYCL achieve nearly native performance. This means that we can now run our SYCL-based applications across all major vendors’ platforms using the same source code, without modification, and attain performance comparable to the native application, which is a very promising result.
In conclusion, the oneAPI specification technical advisory board (TAB) is thrilled to announce that the Lawrence Berkeley National Laboratory (https://www.lbl.gov/) Physics Division has recently enabled CUDA support for random number generation in oneMKL. This is a new and significant community contribution to the oneMKL interfaces project. More to come!
I’m a Postdoctoral Research Scholar in the Physics Division at Lawrence Berkeley National Laboratory, and a member of the Department of Energy’s High Energy Physics Center for Computational Excellence. My main R&D activities focus on heterogeneous computing and performance portability.
I have primarily been investigating the use of SYCL as a grand unified “API of everything” to address the “many hardware architectures and platforms” problem. Once Codeplay introduced support for CUDA devices, it became possible to compile and execute SYCL-based codes on Nvidia GPUs. However, we still needed to tap into vendor-specific libraries for our applications and benchmarking.