[PATCH] D32431: [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen
Philipp Schaad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 2 13:13:18 PDT 2017
PhilippSchaad marked 17 inline comments as done.
PhilippSchaad added inline comments.
================
Comment at: include/polly/LinkAllPasses.h:51
#ifdef GPU_CODEGEN
-llvm::Pass *createPPCGCodeGenerationPass();
+llvm::Pass *createPPCGCodeGenerationPass(int Runtime);
#endif
----------------
Meinersbur wrote:
> etherzhhb wrote:
> > PhilippSchaad wrote:
> > > etherzhhb wrote:
> > > > is this Runtime supposed to be with type GPURuntimeT ? it is a little bit tricky here.
> > > > Maybe we need to introduce a PPCG header and define the runtime enum there, than include that runtime enum.
> > > >
> > > > or we can declare the function as
> > > >
> > > >
> > > > ```
> > > > llvm::Pass *createPPCGCodeGenerationPass(int Runtime = 0);
> > > > ```
> > > >
> > > > to at least avoid the magic number 0 in line 86.
> > > >
> > > >
> > > Yes, it would be. The reason it's not is exactly the one you mentioned. I was considering adding a PPCG header, but refrained from it because I was hesitant about creating a header 'just for one enum'. If you agree that this is a good solution, I will indeed introduce a new header for PPCG and define the enum there, to get rid of magic numbers. The second option seems reasonable too though.
> > we could start from the second option if you think it is reasonable
> I assume you kept the arguments of type `int` to not include header files here.
That is correct, yes.
================
Comment at: lib/Support/RegisterPasses.cpp:331-348
+ int Arch;
+ int Runtime;
+
+ static const int UseOpenCLRuntime = 2;
+ Runtime = UseOpenCLRuntime;
+
+ switch (GPUArch) {
----------------
Meinersbur wrote:
> ```
> int Arch;
> switch (GPUArch) {
> case GPU_ARCH_NVPTX64;
> Arch = 1;
> break;
> }
>
> int Runtime;
> switch (GPURuntime) {
> case GPU_RUNTIME_CUDA:
> Runtime = 1;
> break;
> case GPU_RUNTIME_OPENCL:
> Runtime = 2;
> breal
> }
>
> PM.add(polly::createPPCGCodeGenerationPass(Arch, Runtime))
> ```
>
> With "you don't need a switch anymore", I was thinking about the like of:
> ```
> PM.add(polly::createPPCGCodeGenerationPass(1, GPURuntime + 1));
> ```
> Your choice.
>
> It could be helpful to have `createPPCGCodeGenerationPass` accept the GPURuntime enum as arguments instead.
>
> In your solution, I don't see the use of `static const int` local variables. If you want identifiers that give names to the accepted arguments, declare them as `#define` or `static const int` in the header file that also declares `createPPCGCodeGenerationPass`, so these can be used in the implementation of `createPPCGCodeGenerationPass` as well.
> With "you don't need a switch anymore", I was thinking about the like of:
```
PM.add(polly::createPPCGCodeGenerationPass(1, GPURuntime + 1));
```
> Your choice.
Personally, I think the current solution is tiny little bit more 'documenting'. Both is fine though, good call.
> It could be helpful to have createPPCGCodeGenerationPass accept the GPURuntime enum as arguments instead.
That is true, would mean having to provide a header with that GPURuntime enum instead though, right?
> In your solution, I don't see the use of static const int local variables. If you want identifiers that give names to the accepted arguments, declare them as #define or static const int in the header file that also declares createPPCGCodeGenerationPass, so these can be used in the implementation of createPPCGCodeGenerationPass as well.
Would it maybe make sense to introduce a PPCGCodeGeneration header at this point?
================
Comment at: tools/GPURuntime/GPUJIT.c:610-618
+ Ret = clSetKernelArgFcnPtr(CLKernel->Kernel, i, 8, (void *)Parameters[i]);
+ if (Ret == CL_INVALID_ARG_SIZE) {
+ Ret = clSetKernelArgFcnPtr(CLKernel->Kernel, i, 4, (void *)Parameters[i]);
+ if (Ret == CL_INVALID_ARG_SIZE) {
+ Ret =
+ clSetKernelArgFcnPtr(CLKernel->Kernel, i, 2, (void *)Parameters[i]);
+ if (Ret == CL_INVALID_ARG_SIZE) {
----------------
Meinersbur wrote:
> Trying each argument size after the other and hoping one matches is not good. The caller must know the argument sizes. You probably have to pass the sizes in another argument to `launchKernelCL` that contains those sizes for each argument, generated by Polly.
>
> Without this, the code will fail if you pass a struct (or vector) of size other than 8, 4, 2, or 1.
Yes, this is a priority issue still. The issue will have to be resolved at some point. This is basically a temporary way around some (probably) major argument handling changes in PPCG etc.
https://reviews.llvm.org/D32431
More information about the llvm-commits
mailing list