[PATCH] D32431: [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen

Philipp Schaad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 27 07:49:25 PDT 2017


PhilippSchaad added inline comments.


================
Comment at: lib/CodeGen/PPCGCodeGeneration.cpp:1562
+  else if (GPUNodeBuilder::Runtime == GPU_RUNTIME_OPENCL)
+    GPUModule->setTargetTriple(Triple::normalize("nvptx64-nvidia-nvcl"));
+
----------------
Meinersbur wrote:
> grosser wrote:
> > PhilippSchaad wrote:
> > > Meinersbur wrote:
> > > > etherzhhb wrote:
> > > > > PhilippSchaad wrote:
> > > > > > Meinersbur wrote:
> > > > > > > Is there some vendor-neutral triple?
> > > > > > Do you mean like `nvptx64-nvcl` / `nvptx64-cuda`?
> > > > > For OpenCL, it can be "spir-unknown-unknown" or "spir64-unknown-unknown", but that may not work :)
> > > > I was hoping there might be some kind of triple that works for OpenCL in general, not only for NVIDIA (`nvptx`, `nvcl`). If the generated program only works on devices that support CUDA anyway, I don't see the benefit of such a backend.
> > > > 
> > > > If there is indeed no backend that also works on non-NVIDIA devices, should we name the runtime accordingly, e.g. "nvcl"?
> > > Looking into it. The next goal would be to add the AMDGPU backend to generate AMD ISA, which would then again utilize the same OpenCL runtime implemented here. (I realize there will have to be some naming changes to make that clear in the `GPUJIT`, but as you pointed out, I have a naming mess to fix there anyway.)
> > Making OpenCL work for CUDA is just the first step. I expect that when adding AMDGPU support, we will use different triples here, depending on which vendor we target. AMD will have a specific one, CUDA will have a specific one, and for Intel we will likely use the generic SPIR-V target. I assume this could then also work for Xilinx.
> At compile time, we don't know which hardware it will run on, so we cannot specify a triple here.
> 
> Unless you are thinking of a runtime dispatch system, in which case you would need to generate all the kernels at once. Even then, I would still like to be able to select a single target for the case where I know I will run only on that hardware, to keep the executable small.
I thought the goal was to let the user compile for a specific target, i.e. to provide something like -polly-gpu-arch=amd/nvidia/intel and then choose the correct target triple according to that selection. For example, -polly-gpu-arch=amd would use the AMDGPU backend's triple and feed the result into the OpenCL runtime.

Am I misunderstanding something?
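
To illustrate what I have in mind, here is a minimal sketch (not part of this patch; the option name, the helper, and the AMD/Intel triples are assumptions on my side, only the NVIDIA one is what the patch currently sets):

  // Hypothetical mapping from a -polly-gpu-arch style selection to the
  // target triple of the generated kernel module. All of these would be
  // handed to the same OpenCL runtime at execution time.
  #include <string>
  #include "llvm/ADT/StringRef.h"
  #include "llvm/ADT/StringSwitch.h"
  #include "llvm/ADT/Triple.h"
  #include "llvm/IR/Module.h"

  static void setOpenCLTargetTriple(llvm::Module &GPUModule,
                                    llvm::StringRef GPUArch) {
    std::string TT =
        llvm::StringSwitch<std::string>(GPUArch)
            .Case("nvidia", "nvptx64-nvidia-nvcl")   // NVPTX backend (this patch)
            .Case("amd", "amdgcn-amd-amdhsa")        // AMDGPU backend (guess)
            .Case("intel", "spir64-unknown-unknown") // generic SPIR (guess)
            .Default("nvptx64-nvidia-nvcl");
    GPUModule.setTargetTriple(llvm::Triple::normalize(TT));
  }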


https://reviews.llvm.org/D32431

