Software vendors who distribute SYCL binaries to end users with unknown hardware configurations typically want a single binary that runs on any hardware.
In current SYCL implementations such as DPC++ or hipSYCL, constructing such “universal” binaries can be an arduous task: their multipass compilation model requires parsing the code separately for each targeted backend and, in the case of AMD hardware, once per target device. This can inflate compile times to impractical levels.
We will discuss hipSYCL’s new generic single-pass compiler, which can generate a universal binary while parsing the source code only once. With runtime performance comparable to that of current compilers, the new compiler can substantially outperform other SYCL compilers in compile time, and it provides instant binary portability across NVIDIA, AMD and Intel GPUs, including the Intel Data Center GPU Max Series. We also present the first hipSYCL performance numbers on the Intel Data Center GPU Max Series.