<div dir="ltr">As Stefanos mentioned, LLVM-IR is generally too low-level for this. C/C++ semantics also don't make it easier due to possible aliasing.<div><br></div><div>The 3 cases are very different to analyze.</div><div><br></div><div>1. A[..][..] is syntactically invalid. A is a single pointer.</div><div>2. *B[2] Is not equivalent to 


B[0][2], but to B[2][0]. This jagged/indirect/pointer-chasing arrays accesses are expressed as chains of getelementptr and load instructions. However, they cannot be usefully optimized because pointers on each level can actually point to the same thing. Eg:</div><div><br></div><div>int data[] = { 41, 42, 43 };</div><div>B[0] = data;</div><div>B[1] = data;</div><div>B[2] = data;</div><div><br></div><div>therefore</div><div><br></div><div>B[0][1] == B[1][1] == B[2][1] == 42;</div><div><br></div><div>3. S.y[..] can be one or multiple getelementptr instructions and is the most analyzable since all indices are static at compile-time. The number of subscripts and their values can directly be taken from the getelementptr instruction(s). The pointers-to-int can still alias with other ints in the program.</div><div><br></div><div>4. These case of VLA or manually linearized array: </div><div><br></div><div>int *C = alloca(sizeof(int) * m * n);</div><div>C[x + m*y];</div><div><br></div><div>This is intended to have two dimensions, but appear as just one in the LLVM-IR. The following (C99 VLA) is compiled to approximately the same LLVM-IR</div><div><br></div><div>int C[n][m];</div><div>C[y][x];</div><div><br></div><div>Delinearization as mentioned by Stefanos tries to recover the two subscripts x and y, but can only do so heuristically. Also keep in mind that </div><div>C[1][-1] appears as the same IR as C[0][m-1], so there is no unique way to delinerarize. In particular, one cannot just assume that if all indices are different, that the memory locations being accessed are different (again, a pointer aliasing problem)</div><div><br></div><div>Michael</div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Am Fr., 2. Okt. 2020 um 19:25 Uhr schrieb Stefanos Baziotis via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Hi Ees,<div dir="auto"><br></div><div dir="auto">SCEV Delinearization is the closest I know. But it has its problems. Well for one your expression should be SCEVable.</div><div dir="auto"><br></div><div dir="auto">But more importantly, SCEV Delinearization is trying to deduce something that is high-level (actually source-level) from a low-level IR in which a lot of this info has been lost. So, since there's not a 1-1 mapping from high-level code to LLVM IR, going backwards will always be imperfect.</div><div dir="auto"><br></div><div dir="auto">And so since you are too trying to deduce a high-level thing, I believe that any solution will be imperfect.</div><div dir="auto"><br></div><div dir="auto">Best,</div><div dir="auto">Stefanos</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Oct 3, 2020, 02:55 Ees via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" rel="noreferrer noreferrer noreferrer" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Anyone? I'd really appreciate any hints to look up as i'm somewhat stuck <br>

with this.<br>

<br>

Cheers.<br>

<br>

On 23-09-2020 12:27, Ees wrote:<br>

> Hi all,<br>

><br>

> For loads and stores i want to extract information about the number of <br>

> indices accessed. For instance:<br>

><br>

> struct S {int X, int *Y};<br>

><br>

> __global__ void kernel(int *A, int **B, struct S) {<br>

>   int x = A[..][..]; // -> L: A[..][..]<br>

>   int y = *B[2];   // -> L: B[0][2]<br>

>   int z = S.y[..];  // -> L: S.1[..]<br>

><br>

>   // etc..<br>

> }<br>

><br>

> I am performing some preprocessing on IR to:<br>

> 1. Move constant inline GEPs into instructions<br>

> 2. For loads and stores without a GEP operand, explicitly create a <br>

> (trivial) GEP with index 0<br>

><br>

> So now the operand of every load and store is a GEP instruction.<br>

><br>

> For simple stuff i am getting the right answer but when the index <br>

> expression becomes more complex multiple GEPs are introduced. For <br>

> instance:<br>

><br>

> *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5;<br>

><br>

> produces:<br>

><br>

>   %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()<br>

>   %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()<br>

>   %8 = mul i32 %6, %7,<br>

>   %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()<br>

>   %10 = add i32 %8, %9,<br>

>   %11 = add i32 %10, 1,<br>

>   %12 = mul i32 2, %11,<br>

>   %13 = zext i32 %12 to i64,<br>

>   %14 = getelementptr inbounds i32, i32* %0, i64 %13<br>

>   %15 = getelementptr inbounds i32, i32* %14, i64 2<br>

>   %16 = getelementptr inbounds i32, i32* %15, i64 3<br>

>   store i32 5, i32* %16, align 4,<br>

><br>

> So i guess relying on the number of GEPs to figure the number of <br>

> indices is only a heuristic. Is there a more robust way to go on about <br>

> it? Or some example i can look at?<br>

><br>

> PS: I'm only interested about CUDA kernels.<br>

><br>

> Ees<br>

><br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" rel="noreferrer noreferrer noreferrer noreferrer" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer noreferrer noreferrer noreferrer noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

</blockquote></div>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

</blockquote></div>