[llvm-dev] Looking for suggestions: Inferring GPU memory accesses

Sun Aug 23 12:24:13 PDT 2020

Hello Johannes,

Thank you very much for the material. I will have a look and get back to 
you (possibly with questions if you don't mind :) ).
I would also appreciate the code if that's available.

- Ees

On 23-08-2020 18:47, Johannes Doerfert wrote:
> Hi Ees,
>
> A while back we started a project with a similar scope.
> Unfortunately the development slowed down and the plans to revive it 
> this summer got tanked by the US travel restrictions.
>
> Anyway, there is some existing code that might be useful, though it 
> is at a prototype stage. While I'm obviously biased, I would suggest we 
> continue from there.
>
> @Alex @Holger can we put the latest version on GitHub or some other 
> place to share it? I'm unsure if the code I (might) have access to is 
> the latest.
>
> @Ees I attached a recent paper and you might find the following links 
> useful:
>
>    * 2017 LLVM Developers’ Meeting: J. Doerfert “Polyhedral Value & 
> Memory Analysis” https://youtu.be/xSA0XLYJ-G0
>
>    * "Automated Partitioning of Data-Parallel Kernels using Polyhedral 
> Compilation.", P2S2 2020 (slides and video 
> https://www.mcs.anl.gov/events/workshops/p2s2/2020/program.php)
>
>
> Let us know what you think :)
>
> ~ Johannes
>
>
>
>
> On 8/22/20 9:38 AM, Ees Kee via llvm-dev wrote:
> > Hi all,
> >
> > As part of my research I want to investigate the relation between the
> > grid's geometry and the memory accesses of a kernel in common GPU
> > benchmarks (e.g., Rodinia, Polybench, etc.). As a first step I want to
> > answer the following question:
> >
> > - Given a kernel function with M possible memory accesses, for how many
> > of those M accesses can we statically infer the location, given concrete
> > values for the grid/block and executing thread?
> >
> > (Assume CUDA only for now)
> >
> > My initial idea is to replace all uses of dim-related values, e.g.:
> >     __cuda_builtin_blockDim_t::__fetch_builtin_x()
> >     __cuda_builtin_gridDim_t::__fetch_builtin_x()
> >
> > and index-related values, e.g.:
> >     __cuda_builtin_blockIdx_t::__fetch_builtin_x()
> >     __cuda_builtin_threadIdx_t::__fetch_builtin_x()
> >
> > with ConstantInts. Then run constant folding on the result and check
> > how many GEPs have constant values.
> >
> > Would something like this work or are there complications I am not
> > thinking of? I'd appreciate any suggestions.
> >
> > P.S. I am new to LLVM.
> >
> > Thanks in advance,
> > Ees
>
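
The idea in the original post (substitute constants for the grid/block builtins, fold, then count the accesses whose address became constant) can be illustrated with a toy model. This is only a sketch and does not use the LLVM API: the `Const`, `Sym`, `BinOp`, `fold`, and `count_constant_accesses` names are hypothetical, index expressions are modelled as small trees rather than GEP instructions, and the concrete grid/thread values are made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Sym:
    name: str  # e.g. "threadIdx.x", a kernel argument, or a loaded value


@dataclass(frozen=True)
class BinOp:
    op: str  # "+" or "*"
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Sym, BinOp]


def fold(expr: Expr, env: Dict[str, int]) -> Expr:
    """Replace symbols bound in env by constants and fold constant subtrees."""
    if isinstance(expr, Const):
        return expr
    if isinstance(expr, Sym):
        return Const(env[expr.name]) if expr.name in env else expr
    lhs, rhs = fold(expr.lhs, env), fold(expr.rhs, env)
    if isinstance(lhs, Const) and isinstance(rhs, Const):
        if expr.op == "+":
            return Const(lhs.value + rhs.value)
        if expr.op == "*":
            return Const(lhs.value * rhs.value)
        raise ValueError(f"unsupported operator: {expr.op}")
    return BinOp(expr.op, lhs, rhs)


def count_constant_accesses(accesses: List[Expr], env: Dict[str, int]) -> int:
    """Count index expressions that fold to a single constant under env."""
    return sum(isinstance(fold(e, env), Const) for e in accesses)


# Hypothetical launch configuration and executing thread.
env = {"blockDim.x": 256, "gridDim.x": 4, "blockIdx.x": 1, "threadIdx.x": 3}

# gid = blockIdx.x * blockDim.x + threadIdx.x
gid = BinOp("+", BinOp("*", Sym("blockIdx.x"), Sym("blockDim.x")), Sym("threadIdx.x"))

accesses = [
    gid,                       # A[gid]: depends only on the builtins
    BinOp("+", gid, Sym("n")), # B[gid + n]: n is a kernel argument
    Sym("loaded"),             # C[idx]: idx was loaded from memory
]

inferable = count_constant_accesses(accesses, env)
```

In this model only the first access folds to a constant. Indices that involve kernel arguments or values loaded from memory stay symbolic, so they would count as not statically inferable unless those values are also fixed.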