[llvm-dev] Information about the number of indices in memory accesses
Ees via llvm-dev
llvm-dev at lists.llvm.org
Sat Oct 3 05:26:55 PDT 2020
Stefanos, Michael, thank you for your responses. I've been toying with
this for some days now and I had figured/feared it might not be
possible; I just thought there might be something I was missing. I'm new
to LLVM, so I'm not aware of most things. In any case, I've decided to
take another direction with this.
May I ask one more thing? Say I want, for each load/store, to get a name
for the base pointer: for instance, the "A", "B", and "S.1" in the
example above.
In this example:
%14 = getelementptr inbounds i32, i32* %0, i64 %13
%15 = getelementptr inbounds i32, i32* %14, i64 2
%16 = getelementptr inbounds i32, i32* %15, i64 3
store i32 5, i32* %16, align 4,
I can follow the GEP chain until I hit an argument/global/alloca. But
what if there are temporaries in between? Is there a more
generic/systematic way to go about it?
Is there perhaps a project you are aware of that does something similar,
which I could have a look at?
Cheers,
Ees
On 03-10-2020 12:43, Stefanos Baziotis wrote:
> Michael makes a great point about aliasing here and different indexing
> that accesses the same element!
>
> Another note: x = A[0][2] is fundamentally different depending on the
> type of `A`. If, e.g., A was declared as int A[10][20], there's only _one_
> load. A is (and is treated as) a linear buffer,
> and the GEPs only pinpoint the specific position of A[0][2] in this buffer
> (i.e. 0*20 + 2). But if A was, e.g., int **A, there are _two_ loads:
> one load to get the "pointer to the zeroth row" and another load to
> get the 2nd element off of that row.
>
> So, you see, all these things make any deduction method very very
> imprecise.
>
> Best,
> Stefanos
>
> On Sat, Oct 3, 2020 at 5:13 AM, Michael Kruse
> <llvmdev at meinersbur.de> wrote:
>
> As Stefanos mentioned, LLVM-IR is generally too low-level for
> this. C/C++ semantics also don't make it easier due to possible
> aliasing.
>
> The 3 cases are very different to analyze.
>
> 1. A[..][..] is syntactically invalid. A is a single pointer.
> 2. *B[2] is not equivalent to B[0][2], but to B[2][0]. These
> jagged/indirect/pointer-chasing array accesses are expressed as
> chains of getelementptr and load instructions. However, they
> cannot be usefully optimized because pointers on each level can
> actually point to the same thing. E.g.:
>
> int data[] = { 41, 42, 43 };
> B[0] = data;
> B[1] = data;
> B[2] = data;
>
> therefore
>
> B[0][1] == B[1][1] == B[2][1] == 42;
>
> 3. S.y[..] can be one or multiple getelementptr instructions and
> is the most analyzable since all indices are static at
> compile-time. The number of subscripts and their values can
> directly be taken from the getelementptr instruction(s). The
> pointers-to-int can still alias with other ints in the program.
>
> 4. The case of a VLA or a manually linearized array:
>
> int *C = alloca(sizeof(int) * m * n);
> C[x + m*y];
>
> This is intended to have two dimensions, but appears as just one in
> the LLVM-IR. The following (a C99 VLA) compiles to approximately
> the same LLVM-IR:
>
> int C[n][m];
> C[y][x];
>
> Delinearization as mentioned by Stefanos tries to recover the two
> subscripts x and y, but can only do so heuristically. Also keep in
> mind that
> C[1][-1] appears as the same IR as C[0][m-1], so there is no
> unique way to delinearize. In particular, one cannot assume that
> if all indices are different, the memory locations being
> accessed are different (again, a pointer aliasing problem).
>
> Michael
>
> On Fri., Oct 2, 2020 at 7:25 PM, Stefanos Baziotis via
> llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Ees,
>
> SCEV Delinearization is the closest I know of, but it has its
> problems. For one, your expression needs to be SCEVable.
>
> But more importantly, SCEV Delinearization is trying to deduce
> something that is high-level (actually source-level) from a
> low-level IR in which a lot of this info has been lost. So,
> since there's not a 1-1 mapping from high-level code to LLVM
> IR, going backwards will always be imperfect.
>
> And since you too are trying to deduce a high-level thing,
> I believe that any solution will be imperfect.
>
> Best,
> Stefanos
>
> On Sat, Oct 3, 2020, 02:55 Ees via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Anyone? I'd really appreciate any hints to look up, as I'm
> somewhat stuck with this.
>
> Cheers.
>
> On 23-09-2020 12:27, Ees wrote:
> > Hi all,
> >
> > For loads and stores I want to extract information about the
> > number of indices accessed. For instance:
> >
> > struct S { int X; int *Y; };
> >
> > __global__ void kernel(int *A, int **B, struct S S) {
> > int x = A[..][..]; // -> L: A[..][..]
> > int y = *B[2]; // -> L: B[0][2]
> > int z = S.y[..]; // -> L: S.1[..]
> >
> > // etc..
> > }
> >
> > I am performing some preprocessing on IR to:
> > 1. Move constant inline GEPs into instructions
> > 2. For loads and stores without a GEP operand, explicitly create
> > a (trivial) GEP with index 0
> >
> > So now the operand of every load and store is a GEP instruction.
> >
> > For simple stuff I am getting the right answer, but when the index
> > expression becomes more complex, multiple GEPs are introduced. For
> > instance:
> >
> > *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5;
> >
> > produces:
> >
> > %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
> > %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
> > %8 = mul i32 %6, %7,
> > %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
> > %10 = add i32 %8, %9,
> > %11 = add i32 %10, 1,
> > %12 = mul i32 2, %11,
> > %13 = zext i32 %12 to i64,
> > %14 = getelementptr inbounds i32, i32* %0, i64 %13
> > %15 = getelementptr inbounds i32, i32* %14, i64 2
> > %16 = getelementptr inbounds i32, i32* %15, i64 3
> > store i32 5, i32* %16, align 4,
> >
> > So I guess relying on the number of GEPs to figure out the number
> > of indices is only a heuristic. Is there a more robust way to go
> > about it? Or some example I can look at?
> >
> > PS: I'm only interested in CUDA kernels.
> >
> > Ees
> >
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>