[llvm-dev] Information about the number of indices in memory accesses

Stefanos Baziotis via llvm-dev llvm-dev at lists.llvm.org
Sat Oct 3 05:57:24 PDT 2020


Hi Ees,

You may want to have a look at GetUnderlyingObjects (
https://llvm.org/doxygen/namespacellvm.html#ad3428471506e6c03e9395a697a897a83
).
The page takes some time to load. Alternatively, you can see the
implementation at ValueTracking.cpp (
https://llvm.org/doxygen/ValueTracking_8cpp_source.html)
and use it by including `llvm/Analysis/ValueTracking.h`.

Note that you probably don't want to call this frequently.
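
For instance, here is a minimal sketch (hedged: recent LLVM spells the
function getUnderlyingObjects; around LLVM 10 it was GetUnderlyingObjects
and also took a DataLayout argument):

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/Analysis/ValueTracking.h"
  #include "llvm/IR/Instructions.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  // Print each possible underlying object (argument, global, alloca, ...)
  // of a load/store's pointer operand. More than one object can be found,
  // e.g. when the pointer flows through a phi or a select.
  static void printBases(Instruction &I) {
    Value *Ptr = nullptr;
    if (auto *Load = dyn_cast<LoadInst>(&I))
      Ptr = Load->getPointerOperand();
    else if (auto *Store = dyn_cast<StoreInst>(&I))
      Ptr = Store->getPointerOperand();
    if (!Ptr)
      return;

    SmallVector<const Value *, 4> Objects;
    getUnderlyingObjects(Ptr, Objects);
    for (const Value *Obj : Objects)
      errs() << I << "\n  base: "
             << (Obj->hasName() ? Obj->getName() : StringRef("<unnamed>"))
             << "\n";
  }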

Best,
Stefanos

On Sat, Oct 3, 2020 at 3:26 PM, Ees <kayesg42 at gmail.com> wrote:

> Stefanos, Michael, thank you for your responses. I've been toying with
> this for some days now, and I kind of figured/feared this might not be
> possible; I just thought that maybe there was something. I'm new to LLVM,
> so I'm not aware of most things. In any case, I've decided to take another
> direction with this.
>
> May I ask one more thing? Say I want, for each load/store, to get a name
> for the base pointer, for instance the "A", "B", "S.1" in the example above.
>
> In this example:
>
>   %14 = getelementptr inbounds i32, i32* %0, i64 %13
>   %15 = getelementptr inbounds i32, i32* %14, i64 2
>   %16 = getelementptr inbounds i32, i32* %15, i64 3
>   store i32 5, i32* %16, align 4
>
> I can follow the GEP chain until I hit an argument/global/alloca. But what
> if there are temporaries in between? Is there a more generic/systematic
> way to go about it?
> Maybe some project you are aware of that does something similar, that I
> could have a look at?
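>
> (A minimal sketch of that walk, editorially hedged: GEPOperator, from
> llvm/IR/Operator.h, also matches constant GEP expressions. The loop
> deliberately stops at anything else, e.g. a load, phi, or bitcast, which
> is exactly what the getUnderlyingObjects helper mentioned above looks
> through for you.)
>
>   const Value *V = Ptr; // Ptr: the load/store pointer operand
>   while (auto *GEP = dyn_cast<GEPOperator>(V))
>     V = GEP->getPointerOperand();
>   // V is now an Argument, GlobalVariable, AllocaInst, or some other
>   // "temporary" (load, phi, select, cast) that needs more handling.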
>
> Cheers,
> Ees
> On 03-10-2020 12:43, Stefanos Baziotis wrote:
>
> Michael makes a great point here about aliasing, and about different
> indexings that access the same element!
>
> Another note: x = A[0][2] is fundamentally different depending on the type
> of `A`. If A was declared as, e.g., int A[10][20], there's only _one_
> load. A is (and is treated as) a linear buffer, and the GEPs only pinpoint
> the specific position of A[0][2] in this buffer (i.e., element 0*20 + 2).
> But if A was, e.g., int **A, there are _two_ loads: one load to get the
> "pointer of the zeroth row" and another load to get the 2nd element off of
> that row.
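>
> A tiny illustration of the difference (hedged; the load counts in the
> comments are what Clang typically emits for these declarations):
>
>   int A1[10][20];      // one contiguous buffer of 200 ints
>   int x1 = A1[0][2];   // ONE load; the GEP computes offset 0*20 + 2
>
>   int **A2 = nullptr;  // assume it points at an array of row pointers
>   int x2 = A2[0][2];   // TWO loads: load the row pointer A2[0], then
>                        // load element 2 of that row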
>
> So, you see, all these things make any deduction method very, very
> imprecise.
>
> Best,
> Stefanos
>
On Sat, Oct 3, 2020 at 5:13 AM, Michael Kruse <llvmdev at meinersbur.de>
wrote:
>
>> As Stefanos mentioned, LLVM-IR is generally too low-level for this. C/C++
>> semantics also don't make it easier due to possible aliasing.
>>
>> The 3 cases are very different to analyze (and there is a 4th worth adding).
>>
>> 1. A[..][..] is syntactically invalid. A is a single pointer.
>> 2. *B[2] is not equivalent to B[0][2], but to B[2][0]. Such
>> jagged/indirect/pointer-chasing array accesses are expressed as chains of
>> getelementptr and load instructions. However, they cannot be usefully
>> optimized, because the pointers at each level can actually point to the
>> same thing. E.g.:
>>
>> int *B[3];
>> int data[] = { 41, 42, 43 };
>> B[0] = data;
>> B[1] = data;
>> B[2] = data;
>>
>> therefore
>>
>> B[0][1] == B[1][1] == B[2][1] == 42;
>>
>> 3. S.Y[..] can be one or multiple getelementptr instructions and is the
>> most analyzable case, since struct-field indices in a GEP are required to
>> be compile-time constants. The number of subscripts and their values can
>> be taken directly from the getelementptr instruction(s). The
>> pointers-to-int can still alias with other ints in the program, though.
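>>
>> A hedged sketch of reading those subscripts, assuming the access goes
>> through a single GetElementPtrInst (gep_type_begin/gep_type_end, from
>> llvm/IR/GetElementPtrTypeIterator.h, distinguish struct fields from
>> array subscripts):
>>
>>   if (auto *GEP = dyn_cast<GetElementPtrInst>(Ptr)) {
>>     for (gep_type_iterator GTI = gep_type_begin(GEP),
>>                            GTE = gep_type_end(GEP);
>>          GTI != GTE; ++GTI) {
>>       if (GTI.getStructTypeOrNull()) {
>>         // Struct field: LLVM requires this index to be a constant,
>>         // e.g. the "1" in "S.1".
>>         uint64_t Field =
>>             cast<ConstantInt>(GTI.getOperand())->getZExtValue();
>>         errs() << "field " << Field << "\n";
>>       } else {
>>         // Array/pointer subscript; may be a runtime value.
>>         errs() << "subscript " << *GTI.getOperand() << "\n";
>>       }
>>     }
>>   }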
>>
>> 4. The case of a VLA or a manually linearized array:
>>
>> int *C = alloca(sizeof(int) * m * n);
>> C[x + m*y];
>>
>> This is intended to have two dimensions, but appears as just one in the
>> LLVM-IR. The following (a C99 VLA) compiles to approximately the same
>> LLVM-IR:
>>
>> int C[n][m];
>> C[y][x];
>>
>> Delinearization, as mentioned by Stefanos, tries to recover the two
>> subscripts x and y, but can only do so heuristically. Also keep in mind
>> that C[1][-1] appears as the same IR as C[0][m-1], so there is no unique
>> way to delinearize. In particular, one cannot just assume that if all
>> indices are different, the memory locations being accessed are different
>> (again, a pointer aliasing problem).
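>>
>> A rough sketch of invoking it (hedged: this assumes an LLVM ~11 tree,
>> where delinearize is a member of ScalarEvolution; newer releases moved
>> it to free functions in llvm/Analysis/Delinearization.h):
>>
>>   // SE is the ScalarEvolution analysis; Inst is the load/store and GEP
>>   // its pointer operand.
>>   const SCEV *AccessFn = SE.getSCEV(GEP);
>>   const SCEV *Base = SE.getPointerBase(AccessFn);
>>   AccessFn = SE.getMinusSCEV(AccessFn, Base); // offset from the base
>>
>>   SmallVector<const SCEV *, 4> Subscripts, Sizes;
>>   SE.delinearize(AccessFn, Subscripts, Sizes, SE.getElementSize(Inst));
>>   // On success, Subscripts is a guess at {y, x} and Sizes at the array
>>   // dimensions; both come back empty when the heuristic gives up.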
>>
>> Michael
>>
>> On Fri, Oct 2, 2020 at 19:25, Stefanos Baziotis via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi Ees,
>>>
>>> SCEV delinearization is the closest thing I know of, but it has its
>>> problems. For one, your expression has to be SCEVable.
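>>>
>>> A quick hedged fragment (SE being the function's ScalarEvolution
>>> analysis):
>>>
>>>   Value *Ptr = SI->getPointerOperand(); // SI: your store
>>>   if (SE.isSCEVable(Ptr->getType())) {
>>>     const SCEV *PtrSCEV = SE.getSCEV(Ptr); // safe to ask now
>>>     errs() << *PtrSCEV << "\n";
>>>   }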
>>>
>>> But more importantly, SCEV delinearization is trying to deduce something
>>> that is high-level (actually source-level) from a low-level IR in which a
>>> lot of this information has been lost. Since there's no 1-1 mapping from
>>> high-level code to LLVM IR, going backwards will always be imperfect.
>>>
>>> And since you, too, are trying to deduce a high-level thing, I believe
>>> that any solution will be imperfect.
>>>
>>> Best,
>>> Stefanos
>>>
>>> On Sat, Oct 3, 2020, 02:55 Ees via llvm-dev <llvm-dev at lists.llvm.org>
>>> wrote:
>>>
>>>> Anyone? I'd really appreciate any hints to look up, as I'm somewhat
>>>> stuck with this.
>>>>
>>>> Cheers.
>>>>
>>>> On 23-09-2020 12:27, Ees wrote:
>>>> > Hi all,
>>>> >
>>>> > For loads and stores I want to extract information about the number
>>>> > of indices accessed. For instance:
>>>> >
>>>> > struct S { int X; int *Y; };
>>>> >
>>>> > __global__ void kernel(int *A, int **B, struct S S) {
>>>> >   int x = A[..][..]; // -> L: A[..][..]
>>>> >   int y = *B[2];   // -> L: B[0][2]
>>>> >   int z = S.Y[..];  // -> L: S.1[..]
>>>> >
>>>> >   // etc..
>>>> > }
>>>> >
>>>> > I am performing some preprocessing on the IR to:
>>>> > 1. Move constant inline GEP expressions into instructions
>>>> > 2. For loads and stores without a GEP operand, explicitly create a
>>>> > (trivial) GEP with index 0 (see the sketch below)
>>>> >
>>>> > So now the operand of every load and store is a GEP instruction.
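>>>> >
>>>> > A minimal sketch of step 2 (hedged; IRBuilder comes from
>>>> > llvm/IR/IRBuilder.h, and stores are handled analogously through
>>>> > their pointer operand):
>>>> >
>>>> >   if (auto *LI = dyn_cast<LoadInst>(&I))
>>>> >     if (!isa<GetElementPtrInst>(LI->getPointerOperand())) {
>>>> >       IRBuilder<> B(LI);
>>>> >       Value *GEP = B.CreateInBoundsGEP(LI->getType(),
>>>> >                                        LI->getPointerOperand(),
>>>> >                                        {B.getInt64(0)});
>>>> >       LI->setOperand(0, GEP); // the load now goes through a GEP
>>>> >     }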
>>>> >
>>>> > For simple cases I am getting the right answer, but when the index
>>>> > expression becomes more complex, multiple GEPs are introduced. For
>>>> > instance:
>>>> >
>>>> > *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5;
>>>> >
>>>> > produces:
>>>> >
>>>> >   %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
>>>> >   %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
>>>> >   %8 = mul i32 %6, %7
>>>> >   %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
>>>> >   %10 = add i32 %8, %9
>>>> >   %11 = add i32 %10, 1
>>>> >   %12 = mul i32 2, %11
>>>> >   %13 = zext i32 %12 to i64
>>>> >   %14 = getelementptr inbounds i32, i32* %0, i64 %13
>>>> >   %15 = getelementptr inbounds i32, i32* %14, i64 2
>>>> >   %16 = getelementptr inbounds i32, i32* %15, i64 3
>>>> >   store i32 5, i32* %16, align 4
>>>> >
>>>> > So I guess relying on the number of GEPs to figure out the number of
>>>> > indices is only a heuristic. Is there a more robust way to go about
>>>> > it? Or some example I can look at?
>>>> >
>>>> > PS: I'm only interested in CUDA kernels.
>>>> >
>>>> > Ees
>>>> >