[llvm-dev] [RFC] Upstreaming PACXX (Programming Accelerators with C++)

Sun Feb 4 23:11:29 PST 2018

Hi LLVM community,

After three years of development and various talks at LLVM-HPC, EuroLLVM, and other scientific conferences, I would like to present my PhD research topic to the lists.

The main goal of my research was to develop a single-source programming model comparable to CUDA or SYCL for accelerators supported by LLVM (e.g., Nvidia GPUs). PACXX uses Clang as its front-end for code generation and comes with a runtime library (PACXX-RT) that executes kernels on the available hardware. Currently, PACXX supports Nvidia GPUs through the NVPTX target and CUDA, and CPUs through MCJIT (including whole-function vectorization thanks to RV [1]); it also has an experimental back-end for AMD GPUs using the AMDGPU target and ROCm.

The main idea behind PACXX is to use LLVM IR as the kernel code representation, which is integrated into the executable together with the PACXX-RT. At run time, the PACXX-RT compiles the IR down to the final MC level and hands it over to the device. Since PACXX currently does not enforce any major restrictions on the C++ code, we managed to run (almost) arbitrary C++ code on GPUs, including range-v3 [2, 3].
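
To illustrate the flow described above, here is a rough conceptual sketch. It is not PACXX-RT code: the names DeviceKind, EmbeddedKernel, backend_for and describe_compile are hypothetical, and the compile step is only a stand-in that reports which of the back-ends mentioned above would be chosen for a kernel whose IR travels inside the executable as data.

```cpp
#include <cassert>
#include <string>

// Hypothetical device kinds; these are not PACXX-RT's actual types.
enum class DeviceKind { NvidiaGPU, AmdGPU, HostCPU };

// Kernel IR carried inside the executable as plain data.
// The IR text used with this struct is a placeholder, not real kernel IR.
struct EmbeddedKernel {
  std::string name;
  std::string ir;
};

// Maps a device to the back-end named in the text above.
std::string backend_for(DeviceKind kind) {
  switch (kind) {
    case DeviceKind::NvidiaGPU: return "NVPTX";
    case DeviceKind::AmdGPU:    return "AMDGPU";
    case DeviceKind::HostCPU:   return "MCJIT";
  }
  return "unknown";
}

// Stand-in for the run-time compile step: it only records the decision
// that is deferred until the device is known, and compiles nothing.
std::string describe_compile(const EmbeddedKernel &kernel, DeviceKind kind) {
  assert(!kernel.name.empty() && "kernel needs a name");
  return "compile " + kernel.name + " via " + backend_for(kind);
}
```

The point of the sketch is only that the same embedded IR can be lowered for different devices, because the choice of back-end is made when the program runs rather than when it is built.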

A short vector addition example using PACXX: 

#include <future>
#include <vector>

using namespace pacxx::v2;
int main(int argc, char *argv[]) {
    // get the default executor
    auto &exec = Executor::get();
    size_t size = 128;
    std::vector<int> a(size, 1);
    std::vector<int> b(size, 2);
    std::vector<int> c(size, 0);

    // allocate device side memory
    auto &da = exec.allocate<int>(a.size());
    auto &db = exec.allocate<int>(b.size());
    auto &dc = exec.allocate<int>(c.size());
    // copy data to the accelerator
    da.upload(a);
    db.upload(b);
    dc.upload(c);
    // get the raw pointer
    auto pa = da.get();
    auto pb = db.get();
    auto pc = dc.get();

    // define the computation
    auto vadd = [=](auto &config) {
      auto i = config.get_global(0);
      if (i < size)
        pc[i] = pa[i] + pb[i];
    };

    // launch and synchronize 
    std::promise<void> promise;
    auto future = exec.launch(vadd, {{1}, {128}}, promise);
    future.wait();
    // copy back the data
    dc.download(c);
}
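
For readers unfamiliar with the SPMD style, the following host-only sketch emulates what the launch above does conceptually. It rests on assumptions: that {{1}, {128}} means one block of 128 threads, and that get_global(0) returns the global thread index in dimension 0. KernelConfig, launch_on_host and vadd_on_host are hypothetical names and are not part of PACXX; the loop runs sequentially on the CPU, whereas a device would run the iterations in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the `config` object passed to a kernel.
struct KernelConfig {
  std::size_t block_idx;   // index of the current block
  std::size_t thread_idx;  // index of the thread within its block
  std::size_t block_dim;   // number of threads per block

  // Assumed meaning of get_global(0): global thread index in dimension 0.
  std::size_t get_global(int /*dim*/) const {
    return block_idx * block_dim + thread_idx;
  }
};

// Calls `kernel` once per (block, thread) pair, one after another.
template <typename Kernel>
void launch_on_host(Kernel kernel, std::size_t blocks, std::size_t threads) {
  assert(threads > 0 && "block size must be positive");
  for (std::size_t b = 0; b < blocks; ++b) {
    for (std::size_t t = 0; t < threads; ++t) {
      KernelConfig cfg{b, t, threads};
      kernel(cfg);
    }
  }
}

// Same kernel body as in the example above, applied to host vectors.
std::vector<int> vadd_on_host(const std::vector<int> &a,
                              const std::vector<int> &b,
                              std::size_t blocks, std::size_t threads) {
  assert(a.size() == b.size());
  std::vector<int> c(a.size(), 0);
  const int *pa = a.data();
  const int *pb = b.data();
  int *pc = c.data();
  std::size_t size = a.size();

  auto vadd = [=](auto &config) {
    auto i = config.get_global(0);
    if (i < size)  // guard against launching more threads than elements
      pc[i] = pa[i] + pb[i];
  };

  launch_on_host(vadd, blocks, threads);
  return c;
}
```

Calling vadd_on_host(a, b, 1, 128) with the vectors from the example fills every element of the result with 3. The bounds check inside the kernel is what keeps a launch with more threads than elements safe.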

Recently, I open-sourced PACXX on GitHub [4] under the same license LLVM is currently using.
Since my PhD is now in its final stage, I wanted to ask whether there is interest in having such an SPMD programming model upstreamed.
PACXX is currently on par with release_60 and only requires minor modifications to Clang, e.g., a command line switch, C++ attributes, some diagnostics and metadata generation during code gen. 
The PACXX-RT can be integrated into the LLVM build system and may remain a standalone project. (BTW, may I ask to add PACXX to the LLVM projects?).

Looking forward to your feedback.

Cheers, 
Michael Haidl

[1] https://github.com/cdl-saarland/rv
[2] https://github.com/ericniebler/range-v3
[3] https://dl.acm.org/authorize?N20051
[4] https://github.com/pacxx/pacxx-llvm