[llvm-dev] Information about the number of indices in memory accesses
Ees via llvm-dev
llvm-dev at lists.llvm.org
Sat Oct 3 05:26:55 PDT 2020
Stefanos, Michael, thank you for your responses. I've been toying with
this for some days now and I had figured/feared it might not be
possible; I just thought there might be something I was missing. I'm new
to LLVM, so I'm not aware of most things. In any case, I've decided to
take another direction with this.
May I ask one more thing? Say I want, for each load/store, to get a name
for the base pointer: for instance, the "A", "B", and "S.1" in the
example above.
In this example:
%14 = getelementptr inbounds i32, i32* %0, i64 %13
%15 = getelementptr inbounds i32, i32* %14, i64 2
%16 = getelementptr inbounds i32, i32* %15, i64 3
store i32 5, i32* %16, align 4,
I can follow the GEP chain until I hit an argument/global/alloca. But
what if there are temporaries in between? Is there a more
generic/systematic way to go about it?
Is there perhaps a project you are aware of that does something similar,
which I could have a look at?
Cheers,
Ees
On 03-10-2020 12:43, Stefanos Baziotis wrote:
> Michael makes a great point about aliasing here and different indexing
> that accesses the same element!
>
> Another note: x = A[0][2] is fundamentally different depending on the
> type of `A`. If, e.g., A was declared as int A[10][20], there's only _one_
> load. A is (and is treated as) a linear buffer,
> and the GEPs only pinpoint the specific position of A[0][2] in this buffer
> (i.e. 0*20 + 2). But if A was, e.g., int **A, there are _two_ loads:
> one load to get the "pointer to the zeroth row" and another load to
> get the 2nd element off of that row.
>
> So, you see, all these things make any deduction method very very
> imprecise.
>
> Best,
> Stefanos
>
> On Sat, Oct 3, 2020 at 5:13 AM, Michael Kruse
> <llvmdev at meinersbur.de> wrote:
>
> As Stefanos mentioned, LLVM-IR is generally too low-level for
> this. C/C++ semantics also don't make it easier due to possible
> aliasing.
>
> The 3 cases are very different to analyze.
>
> 1. A[..][..] is syntactically invalid. A is a single pointer.
> 2. *B[2] is not equivalent to B[0][2], but to B[2][0]. These
> jagged/indirect/pointer-chasing array accesses are expressed as
> chains of getelementptr and load instructions. However, they
> cannot be usefully optimized because pointers on each level can
> actually point to the same thing. E.g.:
>
> int data[] = { 41, 42, 43 };
> B[0] = data;
> B[1] = data;
> B[2] = data;
>
> therefore
>
> B[0][1] == B[1][1] == B[2][1] == 42;
>
> 3. S.y[..] can be one or multiple getelementptr instructions and
> is the most analyzable since all indices are static at
> compile-time. The number of subscripts and their values can
> directly be taken from the getelementptr instruction(s). The
> pointers-to-int can still alias with other ints in the program.
>
> 4. The case of a VLA or a manually linearized array:
>
> int *C = alloca(sizeof(int) * m * n);
> C[x + m*y];
>
> This is intended to have two dimensions, but appears as just one in
> the LLVM-IR. The following (a C99 VLA) compiles to approximately
> the same LLVM-IR:
>
> int C[n][m];
> C[y][x];
>
> Delinearization as mentioned by Stefanos tries to recover the two
> subscripts x and y, but can only do so heuristically. Also keep in
> mind that
> C[1][-1] appears as the same IR as C[0][m-1], so there is no
> unique way to delinearize. In particular, one cannot assume that
> if all indices are different, the memory locations being
> accessed are different (again, a pointer aliasing problem).
>
> Michael
>
> On Fri., Oct 2, 2020 at 7:25 PM, Stefanos Baziotis via
> llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Ees,
>
> SCEV Delinearization is the closest I know of, but it has its
> problems. For one, your expression needs to be SCEVable.
>
> But more importantly, SCEV Delinearization is trying to deduce
> something that is high-level (actually source-level) from a
> low-level IR in which a lot of this info has been lost. So,
> since there's not a 1-1 mapping from high-level code to LLVM
> IR, going backwards will always be imperfect.
>
> And since you too are trying to deduce a high-level thing,
> I believe that any solution will be imperfect.
>
> Best,
> Stefanos
>
> On Sat, Oct 3, 2020, 02:55 Ees via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Anyone? I'd really appreciate any hints to look up, as I'm
> somewhat stuck with this.
>
> Cheers.
>
> On 23-09-2020 12:27, Ees wrote:
> > Hi all,
> >
> > For loads and stores I want to extract information about the
> > number of indices accessed. For instance:
> >
> > struct S { int X; int *Y; };
> >
> > __global__ void kernel(int *A, int **B, struct S S) {
> > int x = A[..][..]; // -> L: A[..][..]
> > int y = *B[2]; // -> L: B[0][2]
> > int z = S.y[..]; // -> L: S.1[..]
> >
> > // etc..
> > }
> >
> > I am performing some preprocessing on IR to:
> > 1. Move constant inline GEPs into instructions
> > 2. For loads and stores without a GEP operand, explicitly create
> > a (trivial) GEP with index 0
> >
> > So now the operand of every load and store is a GEP instruction.
> >
> > For simple stuff I am getting the right answer, but when the index
> > expression becomes more complex, multiple GEPs are introduced. For
> > instance:
> >
> > *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5;
> >
> > produces:
> >
> > %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
> > %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
> > %8 = mul i32 %6, %7,
> > %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
> > %10 = add i32 %8, %9,
> > %11 = add i32 %10, 1,
> > %12 = mul i32 2, %11,
> > %13 = zext i32 %12 to i64,
> > %14 = getelementptr inbounds i32, i32* %0, i64 %13
> > %15 = getelementptr inbounds i32, i32* %14, i64 2
> > %16 = getelementptr inbounds i32, i32* %15, i64 3
> > store i32 5, i32* %16, align 4,
> >
> > So I guess relying on the number of GEPs to figure out the number
> > of indices is only a heuristic. Is there a more robust way to go
> > about it? Or some example I can look at?
> >
> > PS: I'm only interested in CUDA kernels.
> >
> > Ees
> >
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>