[llvm-dev] Looking for suggestions: Inferring GPU memory accesses
Ees via llvm-dev
llvm-dev at lists.llvm.org
Sun Aug 23 12:33:12 PDT 2020
@Madhur Thank you, I will have a look at the paper.
> Doing such analysis would be useful for a thread block and not just a
> single thread.
Do you have any concrete use cases in mind?
I was thinking that I could use such an analysis to, for instance,
visualize the memory accesses performed by the kernel (or at least the
ones that it is possible to infer). The relevant literature I find
always involves tracing every access, so I'm thinking that with
something like this, tracing could potentially be reduced significantly.
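To make the substitute-and-fold idea from the earlier mails concrete, here is a rough sketch in plain Python. This is not LLVM code; the tiny expression tree and the `fold` helper are hypothetical stand-ins for IR and for LLVM's constant folding, and the concrete grid/thread values play the role of the ConstantInts substituted for the `__fetch_builtin_*` calls:

```python
# Illustrative sketch only (hypothetical names, not the LLVM API):
# model a kernel's address expressions symbolically, substitute concrete
# grid/block/thread values, and fold constant subtrees. An access whose
# expression folds to an int is statically inferable; one that stays
# partially symbolic depends on runtime data.

def fold(expr, env):
    """Substitute known names from env and fold constant subtrees."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):          # a named value
        return env.get(expr, expr)     # unknown names stay symbolic
    op, lhs, rhs = expr                # ("add" | "mul", left, right)
    l, r = fold(lhs, env), fold(rhs, env)
    if isinstance(l, int) and isinstance(r, int):
        return l + r if op == "add" else l * r
    return (op, l, r)

# a[blockIdx.x * blockDim.x + threadIdx.x]  -- inferable
idx1 = ("add", ("mul", "blockIdx.x", "blockDim.x"), "threadIdx.x")
# a[b[threadIdx.x] ...]  -- indirect, depends on runtime data
idx2 = ("add", ("mul", "blockIdx.x", "blockDim.x"), "b[threadIdx.x]")

env = {"blockIdx.x": 2, "blockDim.x": 256, "threadIdx.x": 7,
       "gridDim.x": 4}

print(fold(idx1, env))   # folds to the constant 519
print(fold(idx2, env))   # stays partially symbolic
```

Counting how many expressions fold to an `int` is the analogue of counting GEPs with constant operands after the substitution.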
On 23-08-2020 19:43, Madhur Amilkanthwar wrote:
> Oh, I see what you mean now. Doing such analysis would be useful for a
> thread block and not just a single thread but as you say you are onto
> something bigger than just a thread.
> We had published a short paper in ICS around this which uses
> polyhedral techniques to do such analysis and reason about uncoalesced
> access patterns in CUDA programs. You can find the paper at
> On Sun, Aug 23, 2020, 11:00 PM Johannes Doerfert via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> Hi Ees,
> a while back we started a project with similar scope.
> Unfortunately the development slowed down and the plans to revive it
> this summer got tanked by the US travel restrictions.
> Anyway, there is some existing code that might be useful, though in
> a prototype stage. While I'm obviously biased, I would suggest we
> continue from there.
> @Alex @Holger, can we put the latest version on GitHub or some other
> place to share it? I'm unsure if the code I (might have) access to is
> the latest.
> @Ees I attached a recent paper, and you might find the following
> links useful:
> * 2017 LLVM Developers’ Meeting: J. Doerfert “Polyhedral Value &
> Memory Analysis ” https://youtu.be/xSA0XLYJ-G0
> * "Automated Partitioning of Data-Parallel Kernels using
> Compilation.", P2S2 2020 (slides and video)
> Let us know what you think :)
> ~ Johannes
> On 8/22/20 9:38 AM, Ees Kee via llvm-dev wrote:
> > Hi all,
> > As part of my research I want to investigate the relation between
> > the grid's geometry and the memory accesses of a kernel in common
> > GPU benchmarks (e.g. Rodinia, Polybench etc.). As a first step I
> > want to answer the following question:
> > - Given a kernel function with M possible memory accesses, for how
> > many of those M accesses can we statically infer their locations,
> > given concrete values for the grid/block dimensions and the
> > executing thread? (Assume CUDA only for now.)
> > My initial idea is to replace all uses of dim-related values, e.g.:
> > __cuda_builtin_blockDim_t::__fetch_builtin_x()
> > __cuda_builtin_gridDim_t::__fetch_builtin_x()
> > and index-related values, e.g.:
> > __cuda_builtin_blockIdx_t::__fetch_builtin_x()
> > __cuda_builtin_threadIdx_t::__fetch_builtin_x()
> > with ConstantInts, then run constant folding on the result and
> > check how many GEPs have constant values.
> > Would something like this work, or are there complications I am not
> > aware of? I'd appreciate any suggestions.
> > P.S. I am new to LLVM.
> > Thanks in advance,
> > Ees
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev