[llvm-dev] Information about the number of indices in memory accesses

Stefanos Baziotis via llvm-dev llvm-dev at lists.llvm.org
Fri Oct 2 17:24:59 PDT 2020


Hi Ees,

SCEV delinearization is the closest thing I know of, but it has its
problems. For one, your expression has to be SCEVable.

More importantly, SCEV delinearization tries to deduce something
high-level (source-level, really) from low-level IR in which a lot of
that information has been lost. Since there is no 1-1 mapping from
high-level code to LLVM IR, going backwards will always be imperfect.

And since you, too, are trying to deduce something high-level, I believe
that any solution will be imperfect.

Best,
Stefanos

On Sat, Oct 3, 2020, 02:55 Ees via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Anyone? I'd really appreciate any hints to look up, as I'm somewhat stuck
> with this.
>
> Cheers.
>
> On 23-09-2020 12:27, Ees wrote:
> > Hi all,
> >
> > For loads and stores I want to extract information about the number of
> > indices accessed. For instance:
> >
> > struct S { int X; int *Y; };
> >
> > __global__ void kernel(int *A, int **B, struct S s) {
> >   int x = A[..][..]; // -> L: A[..][..]
> >   int y = *B[2];     // -> L: B[0][2]
> >   int z = s.Y[..];   // -> L: S.1[..]
> >
> >   // etc..
> > }
> >
> > I am performing some preprocessing on IR to:
> > 1. Move constant inline GEPs into instructions
> > 2. For loads and stores without a GEP operand, explicitly create a
> > (trivial) GEP with index 0
> >
> > So now the operand of every load and store is a GEP instruction.
> >
> > For simple stuff I am getting the right answer, but when the index
> > expression becomes more complex, multiple GEPs are introduced. For
> > instance:
> >
> > *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5;
> >
> > produces:
> >
> >   %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
> >   %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
> >   %8 = mul i32 %6, %7
> >   %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
> >   %10 = add i32 %8, %9
> >   %11 = add i32 %10, 1
> >   %12 = mul i32 2, %11
> >   %13 = zext i32 %12 to i64
> >   %14 = getelementptr inbounds i32, i32* %0, i64 %13
> >   %15 = getelementptr inbounds i32, i32* %14, i64 2
> >   %16 = getelementptr inbounds i32, i32* %15, i64 3
> >   store i32 5, i32* %16, align 4
> >
> > So I guess relying on the number of GEPs to figure out the number of
> > indices is only a heuristic. Is there a more robust way to go about
> > it? Or some example I can look at?
> >
> > PS: I'm only interested in CUDA kernels.
> >
> > Ees
> >
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>