<div><div dir="auto">I’ve written a number of benchmarks related to multidimensional arrays (d=2 to d=6) in multiple languages (C99 and Fortran 77 in one case and 5+ languages in others). Would any of that be useful here, or is the value obvious enough without benchmarking?</div></div><div dir="auto"><br></div><div dir="auto">Jeff</div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jul 24, 2019 at 7:34 PM David Greene via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It's also very common in Fortran.<br>

<br>

                       -David<br>

<br>

Michael Ferguson via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> writes:<br>

<br>

>> It seems that the main advantage of your proposal is that it would<br>

>> allow for non-constant strides (i.e. variable length arrays) in<br>

>> dimensions other than the first one. Do these appear frequently<br>

>> enough in the programs that you're interested in to be worth<br>

>> optimizing for?<br>

><br>

> Yes - at least in Chapel (which is one of the motivating languages)<br>

> these are very common.<br>

> In other words, typical Chapel programs use arrays with a size that is<br>

> not fixed at compile time.<br>

><br>

> Thanks,<br>

><br>

> -michael<br>

><br>

> On Mon, Jul 22, 2019 at 1:02 PM Peter Collingbourne via llvm-dev<br>

> <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>

>><br>

>> The restrictions of this new intrinsic seem very similar to those of<br>

>> the inrange attribute on GEPs. It seems that the main advantage of<br>

>> your proposal is that it would allow for non-constant strides<br>

>> (i.e. variable length arrays) in dimensions other than the first<br>

>> one. Do these appear frequently enough in the programs that you're<br>

>> interested in to be worth optimizing for?<br>

>><br>

>> Peter<br>

>><br>

>> On Sun, Jul 21, 2019 at 3:54 PM Siddharth Bhat via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>

>>><br>

>>> Hello,<br>

>>><br>

>>> We would like to begin discussions around a new set of intrinsics, to better express<br>

>>> multi-dimensional array indexing within LLVM. The motivations and a possible design<br>

>>> are sketched out below.<br>

>>><br>

>>> Rendered RFC link here<br>

>>><br>

>>> Raw markdown:<br>

>>><br>

>>> # Introducing a new multidimensional array indexing intrinsic<br>

>>><br>

>>> ## The request for change: A new intrinsic, `llvm.multidim.array.index.*`.<br>

>>><br>

>>> We propose the addition of a new family of intrinsics, `llvm.multidim.array.index.*`.<br>

>>> This will allow us to represent array indexing into an array `A[d1][d2][..][dk]`<br>

>>> as `llvm.multidim.array.index.k A d1, d2, d3, ... dk` instead of flattening<br>

>>> the information into `gep A, d1 * n1 + d2 * n2 + ... + dk * nk`. The former<br>

>>> unflattened representation is advantageous for analysis and optimisation. It also<br>

>>> allows us to represent array indexing semantics of languages such as Fortran<br>

>>> and Chapel, which differs from that of C based array indexing.<br>

>>><br>

>>> ## Motivating the need for a multi-dimensional array indexing intrinsic<br>

>>><br>

>>> There are primarily one of two views of array indexing involving<br>

>>> multiple dimensions most languages take, which we discuss to motivate<br>

>>> the need for a multi-dimensional array index. This consideration impacts<br>

>>> the kinds of analysis we can perform on the program. In Polly, we care about<br>

>>> dependence analysis, so the examples here will focus on that particular problem.<br>

>>><br>

>>> Let us consider an array indexing operation of the form:<br>

>>> ```cpp<br>

>>> int ex1(int n, int m, B[n][m], int x1, int x2, int y1, int y2) {<br>

>>> __builtin_assume(x1 != x2);<br>

>>> __builtin_assume(y1 != y2);<br>

>>> B[x1][y1] = 1;<br>

>>> printf("%d", B[x2][y2]);<br>

>>> exit(0);<br>

>>> }<br>

>>> ```<br>

>>><br>

>>> One would like to infer that the array indices _interpreted as tuples_<br>

>>> `(x1, y1)` and `(x2, y2)` do not have the same value, due to the guarding asserts<br>

>>> that `x1 != x2` and `y1 != y2`. As a result, the write `B[x1][y1] = 1` can<br>

>>> in no way interfere with the value of `B[x2][y2]`. Consquently,<br>

>>> we can optimise the program into:<br>

>>><br>

>>><br>

>>> ```cpp<br>

>>> int ex1_opt(int n, int m, B[n][m], int x1, int x2, int y1, int y2) {<br>

>>>     // B[x1][y1] = 1 is omitted because the result<br>

>>>     // cannot be used:<br>

>>>     //  It is not used in the print and then the program exits<br>

>>> printf("%d", B[x2][y2]);<br>

>>> exit(0);<br>

>>> }<br>

>>> ```<br>

>>><br>

>>> However, alas, this is illegal, for the C language does not provide<br>

>>> semantics that allow the final inference above. It is conceivable that<br>

>>> `x1 != x2, y1 != y2`, but the indices do actually alias, since<br>

>>> according to C semantics, the two indices alias if the _flattened<br>

>>> representation of the indices alias_. Consider the parameter<br>

>>> values:<br>

>>><br>

>>> ```<br>

>>> n = m = 3<br>

>>> x1 = 1, y1 = 0; B[x1][y1] = nx1+y1 = 3*1+0=3<br>

>>> x2 = 0, y2 = 3; B[x2][y2] = nx2+y2 = 3*0+3=3<br>

>>> ```<br>

>>><br>

>>> Hence, the array elements `B[x1][y1]` and `B[x2][y2]` _can alias_, and<br>

>>> so the transformation proposed in `ex1_opt` is unsound in general.<br>

>>><br>

>>><br>

>>> In contrast, many langagues other than C require that index<br>

>>> expressions for multidimensional arrays have each component within the<br>

>>> array dimension for that component. As a result, in the example above,<br>

>>> the index pair `(0,3)` would be out-of-bounds. In languages with these<br>

>>> semantics, one can infer that the indexing:<br>

>>><br>

>>> `[x1][y1] != [x2][y2]` iff `x1 != x2 || y1 != y2`.<br>

>>><br>

>>> While this particular example is not very interesting, it shows the<br>

>>> spirit of the lack of expressiveness in LLVM we are trying to<br>

>>> improve.<br>

>>><br>

>>> Julia, Fortran, and Chapel are examples of such languages which target<br>

>>> LLVM.<br>

>>><br>

>>> Currently, the LLVM semantics of `getelementptr` talk purely of<br>

>>> the C style flattened views of arrays. This inhibits the ability of the optimiser<br>

>>> to understand the multidimensional examples as given above, and we<br>

>>> are forced to make conservative assumptions, inhibiting optimisation.<br>

>>> This information must ideally be expressible in LLVM, such that LLVM's optimisers and<br>

>>> alias analysis can use this information to model multidimensional-array<br>

>>> semantics.<br>

>>><br>

>>> There is a more realistic (and involved) example in [Appendix A](#Appendix-A)<br>

>>> in the same spirit as the above simple example, but one a compiler might<br>

>>> realistically wish to perform.<br>

>>><br>

>>> ## Evaluation of the impact of the intrinsic on accuracy of dependence analysis<br>

>>><br>

>>> - This has been implemented in an exprimental branch of Polly, and was used<br>

>>> on the COSMO climate weather model. This greatly helped increase the accuracy<br>

>>> of Polly's analysis, since we eliminated the guessing game from the array analysis.<br>

>>><br>

>>> - This has also been implemented as part of a GSoC effort to unify<br>

>>> Chapel and Polly.<br>

>>><br>

>>> - Molly, the distributed version of Polly written by Michael Kruse for his PhD<br>

>>>   also implemented a similar scheme. In his use-case, optimistic run-time checks<br>

>>>   with delinearization was not possible, so this kind of intrinsic was _necessary_<br>

>>>   for the application, not just _good to have_. More details are available<br>

>>>   in his PhD thesis: [Lattice QCD Optimization and Polytopic Representations of Distributed Memory](<a href="https://tel.archives-ouvertes.fr/tel-01078440)" rel="noreferrer" target="_blank">https://tel.archives-ouvertes.fr/tel-01078440)</a>.>>   In particular, Chapter 9 contains a detailed discussion.<br>

>>><br>

>>> # Representations<br>

>>><br>

>>> ## Intrinsic<br>

>>><br>

>>> ### Syntax<br>

>>> ```<br>

>>> <result> = llvm.multidim.array.index.* <ty> <ty>* <ptrval> {<stride>, <idx>}*<br>

>>> ```<br>

>>><br>

>>> ### Overview:<br>

>>><br>

>>> The `llvm.multidim.array.index.*` intrinsic is used to get the address of<br>

>>> an element from an array. It performs address calcuation only and<br>

>>> does not access memory. It is similar to `getelementptr`. However, it imposes<br>

>>> additional semantics which allows the optimiser to provide better optimisations<br>

>>> than `getlementptr`.<br>

>>><br>

>>><br>

>>><br>

>>> ### Arguments:<br>

>>><br>

>>> The first argument is always a type used as the basis for the calculations. The<br>

>>> second argument is always a pointer, and is the base address to start the<br>

>>> calculation from. The remaining arguments are a list of pairs. Each pair<br>

>>> contains a dimension stride, and an offset with respect to that stride.<br>

>>><br>

>>><br>

>>> ### Semantics:<br>

>>><br>

>>> `llvm.multidim.array.index.*` represents a multi-dimensional array index, In particular, this will<br>

>>> mean that we will assume that all indices `<idx_i>` are non-negative.<br>

>>><br>

>>> Additionally, we assume that, for each `<str_i>`, `<idx_i>` pair, that<br>

>>> `0 <= idx_i < str_i`.<br>

>>><br>

>>> Optimizations can assume that, given two llvm.multidim.array.index.* instructions with matching types:<br>

>>><br>

>>> ```<br>

>>> llvm.multidim.array.index.* <ty> <ty>* <ptrvalA> <strA_1>, <idxA_1>, ..., <strA_N>, <idxA_N><br>

>>> llvm.multidim.array.index.* <ty> <ty>* <ptrvalB> <strB_1>, <idxB_1>, ..., <strB_N>, <idxb_N><br>

>>> ```<br>

>>><br>

>>> If `ptrvalA == ptrvalB` and the strides are equal `(strA_1 == strB_1 && ... && strA_N == strB_N)` then:<br>

>>><br>

>>> - If all the indices are equal (that is, `idxA_1 == idxB_1 && ... && idxA_N == idxB_N`),<br>

>>>   then the resulting pointer values are equal.<br>

>>><br>

>>> - If any index value is not equal (that is, there exists an `i` such that `idxA_i != idxB_i`),<br>

>>>   then the resulting pointer values are not equal.<br>

>>><br>

>>><br>

>>> ##### Address computation:<br>

>>> Consider an invocation of `llvm.multidim.array.index.*`:<br>

>>><br>

>>> ```<br>

>>> <result> = call @llvm.multidim.array.index.* <ty> <ty>* <ptrval> <str_0>, <idx_0>, <str_1> <idx_1>, ..., <str_n> <idx_n><br>

>>> ```<br>

>>><br>

>>> If the pairs are denoted by `(str_i, idx_i)`, where `str_i` denotes the stride<br>

>>> and `idx_i` denotes the index of the ith pair, then the final address (in bytes)<br>

>>> is computed as:<br>

>>><br>

>>> ```<br>

>>> ptrval + len(ty) * [(str_0 * idx_0) + (str_1 * idx_1) + ... (str_n * idx_n)]<br>

>>> ```<br>

>>><br>

>>> ## Transitioning to `llvm.multidim.array.index.*`: Allow `multidim_array_index` to refer to a GEP instruction:<br>

>>><br>

>>> This is a sketch of how we might gradually introduce the `llvm.multidim.array.index.*`<br>

>>> intrinsic into LLVM without immediately losing the analyses<br>

>>> that are performed on `getelememtptr` instructions. This section<br>

>>> lists out some possible choices that we have, since the authors<br>

>>> do not have a "best" solution.<br>

>>><br>

>>> ##### Choice 1: Write a `llvm.multidim.array.index.*` to `GEP` pass, with the `GEP` annotated with metadata<br>

>>><br>

>>> This pass will flatten all `llvm.multidim.array.index.*` expressions into a `GEP` annotated with metadata. This metadata will indicate that the index expression computed by the lowered GEP is guaranteed to be in a canonical form which allows the analysis<br>

>>> to infer stride and index sizes.<br>

>>><br>

>>> A multidim index of the form:<br>

>>> ```<br>

>>> %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2, %idx_2<br>

>>> ```<br>

>>><br>

>>> is lowered to:<br>

>>><br>

>>> ```<br>

>>> %mul1 = mul nsw i64 %str_1, %idx_1<br>

>>> %mul2 = mul1 nsw i64 %str_2, %idx_2<br>

>>> %total = add nsw i64 %mul2, %mul1<br>

>>> %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total, !multidim !1<br>

>>> ```<br>

>>> with guarantees that the first term in each multiplication is the stride<br>

>>> and the second term in each multiplication is the index. (What happens<br>

>>> if intermediate transformation passes decide to change the order? This seems<br>

>>> complicated).<br>

>>><br>

>>> **TODO:** Lookup how to attach metadata such that the metadata can communicate<br>

>>> which of the values are strides and which are indeces<br>

>>><br>

>>><br>

>>><br>

>>> # Caveats<br>

>>><br>

>>> Currently, we assume that the array shape is immutable. However, we will need to deal with<br>

>>> being able to express `reshape` like primitives where the array shape can be mutated. However,<br>

>>> this appears to make expressing this information quite difficult: We now need to attach the shape<br>

>>> information to an array per "shape-live-range".<br>

>>><br>

>>><br>

>>> ## Appendix-A<br>

>>><br>

>>> ##### A realistic, more involved example of dependence analysis going wrong<br>

>>><br>

>>> ```cpp<br>

>>> // In an array A of size (n0 x n1),<br>

>>> // fill a subarray of size (s0 x s1)<br>

>>> // which starts at an offset (o0 x o1) in the larger array A<br>

>>> void set_subarray(unsigned n0, unsigned n1,<br>

>>> unsigned o0, unsigned o1,<br>

>>> unsigned s0, unsigned s1,<br>

>>> float A[n0][n1]) {<br>

>>> for (unsigned i = 0; i < s0; i++)<br>

>>> for (unsigned j = 0; j < s1; j++)<br>

>>> S: A[i + o0][j + o1] = 1;<br>

>>> }<br>

>>> ```<br>

>>> We first reduce this index expression to a sum of products:<br>

>>><br>

>>> ```<br>

>>> (i + o0) * n1 + (j + o1) = n1i + n1o0 + j + o1<br>

>>> ix(i, j, n0, n1, o0, o1) = n1i + n1o0 + j + o1<br>

>>> ```<br>

>>><br>

>>> `ix` is the index expression which `LLVM` will see, since it is fully<br>

>>> flattened, in comparison with the multi-dimensional index expression<br>

>>> `index:[i + o0][j + o1] | sizes:[][n1]`.<br>

>>><br>

>>> We will now show _why_ this multi-dimensional index is not always correct,<br>

>>> and why guessing for one language will not work for another:<br>

>>><br>

>>> Consider a call `set_subarray(n0=8, n1=9, o0=4, o1=6, s0=3, s1=6)`. At face<br>

>>> value, this is incorrect if we view the array as 2D grid, since the size<br>

>>> of the array is `(n0, n1) = (8, 9)`, but we are writing an array of size<br>

>>> `(s0, s1) = (3, 6)` starting from `(o0, o1) = (4, 6)`. Clearly, we will<br>

>>> exceed the width of the array, since `(s1 + o1 = 6 + 6 = 12) > (n1 = 9)`.<br>

>>> However, now think of the array as a flattened 1D representation. In this<br>

>>> case, the total size of the array is `n1xn2 = 8x9 = 72`, while the largest<br>

>>> element we will access is at the largest value of `(i, j)`. That is,<br>

>>> `i = s0 - 1 = 2`, and `j = s1 - 1 = 5`.<br>

>>><br>

>>> The largest index will be `ix(i=2, j=5, n0=8, n1=9, o0=4, o1=6) = 8*2+8*4+5+6=59`.<br>

>>> Since `59 < 72`, we are clearly at _legal_ array indices, by C semantics!<br>

>>><br>

>>> The definition of the semantics of the language **changed the illegal<br>

>>> multidimensional access** (which is illegal since it exceeds the `n1`<br>

>>> dimension), into a **legal flattened 1D access** (which is legal since the<br>

>>> flattened array indices are inbounds).<br>

>>><br>

>>> LLVM has no way of expressing these two different semantics. Hence, we are<br>

>>> forced to:<br>

>>> 1. Consider flattened 1D accesses, which makes analysis of index expressions<br>

>>> equivalent to analysis of polynomials over the integers, which is hard.<br>

>>> 2. Guess multidimensional representations, and use them at the expense of<br>

>>> soundness bugs as shown above.<br>

>>> 3. Guess multidimensional representations, use them, and check their validity<br>

>>> at runtime, causing a runtime performance hit. This implementation follows<br>

>>> the description from the paper [Optimistic Delinearization of Parametrically Sized Arrays](optim-delin-parametrically-sized-arrays).<br>

>>><br>

>>><br>

>>> Currently, Polly opts for option (3), which is to emit runtime checks. If<br>

>>> the run-time checks fail, then Polly will not run its optimised code. Instead,<br>

>>> It keeps a copy of the unoptimised code around, which is run in this case.<br>

>>> Note that this effectively doubles the amount of performance-sensitive code<br>

>>> which is finally emitted after running Polly.<br>

>>><br>

>>> Ideally, we would like a mechanism to directly express the multidimensional<br>

>>> semantics, which would eliminate this kind of guesswork from Polly/LLVM,<br>

>>> which would both make code faster, and easier to analyze.<br>

>>><br>

>>> ## References<br>

>>> - [The chapel language specification](<a href="https://chapel-lang.org/docs/1.13/_downloads/chapelLanguageSpec.pdf" rel="noreferrer" target="_blank">https://chapel-lang.org/docs/1.13/_downloads/chapelLanguageSpec.pdf</a>)>> - [Fortran 2003 standard](<a href="http://www.j3-fortran.org/doc/year/04/04-007.pdf%7D%7BFortran" rel="noreferrer" target="_blank">http://www.j3-fortran.org/doc/year/04/04-007.pdf}{Fortran</a> 2003 standard)<br>

>>> - [C++ subscripting](<a href="http://eel.is/c++draft/expr.sub" rel="noreferrer" target="_blank">http://eel.is/c++draft/expr.sub</a>)>> - [Michael Kruse's PhD thesis: Lattice QCD Optimization and Polytopic Representations of Distributed Memory](<a href="https://tel.archives-ouvertes.fr/tel-01078440)" rel="noreferrer" target="_blank">https://tel.archives-ouvertes.fr/tel-01078440)</a>.>> - [Molly source code link for the intrinsic](<a href="https://github.com/Meinersbur/llvm/blob/molly/include/llvm/IR/IntrinsicsMolly.td#L3" rel="noreferrer" target="_blank">https://github.com/Meinersbur/llvm/blob/molly/include/llvm/IR/IntrinsicsMolly.td#L3</a>)>> -[Optimistic Delinearization of Parametrically Sized Arrays](optim-delin-parametrically-sized-arrays)<br>

>>><br>

>>> [optim-delin-parametrically-sized-arrays]: <a href="http://delivery.acm.org/10.1145/2760000/2751248/p351-grosser.pdf?ip=93.3.109.183&id=2751248&acc=CHORUS&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1557948443_2f56675c6d04796f27b84593535c9f70" rel="noreferrer" target="_blank">http://delivery.acm.org/10.1145/2760000/2751248/p351-grosser.pdf?ip=93.3.109.183&id=2751248&acc=CHORUS&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1557948443_2f56675c6d04796f27b84593535c9f70</a>>><br>

>>><br>

>>> _______________________________________________<br>

>>> LLVM Developers mailing list<br>

>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

>>> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>><br>

>><br>

>><br>

>> --<br>

>> --<br>

>> Peter<br>

>> _______________________________________________<br>

>> LLVM Developers mailing list<br>

>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

>> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev_______________________________________________" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev_______________________________________________</a><br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

</blockquote></div></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>