[llvm-dev] [RFC] A new multidimensional array indexing intrinsic
Doerfert, Johannes via llvm-dev
llvm-dev at lists.llvm.org
Mon Jul 22 11:55:44 PDT 2019
On 07/22, Philip Reames via llvm-dev wrote:
> We could also simply extend the existing inrange mechanism to
> non-constantexpr GEPs. It would remove an inconsistency in the semantics,
> be relatively straight forward, and solve the motivating example.
Agreed. I'd very much prefer a GEP extension over an intrinsic solution.
--
Johannes
> On 7/22/19 10:01 AM, Peter Collingbourne via llvm-dev wrote:
> > The restrictions of this new intrinsic seem very similar to those of the
> > inrange attribute on GEPs. It seems that the main advantage of your
> > proposal is that it would allow for non-constant strides (i.e. variable
> > length arrays) in dimensions other than the first one. Do these appear
> > frequently enough in the programs that you're interested in to be worth
> > optimizing for?
> >
> > Peter
> >
> > On Sun, Jul 21, 2019 at 3:54 PM Siddharth Bhat via llvm-dev
> > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> >
> > Hello,
> >
> > We would like to begin discussions around a new set of intrinsics,
> > to better express
> > multi-dimensional array indexing within LLVM. The motivations and
> > a possible design
> > are sketched out below.
> >
> > Rendered RFC link here
> > <https://github.com/bollu/llvm-multidim-array-indexing-proposal/blob/master/RFC.md>
> >
> >
> > Raw markdown:
> >
> > # Introducing a new multidimensional array indexing intrinsic
> >
> > ## The request for change: A new intrinsic, `llvm.multidim.array.index.*`.
> >
> > We propose the addition of a new family of intrinsics, `llvm.multidim.array.index.*`.
> > This will allow us to represent array indexing into an array `A[d1][d2][..][dk]`
> > as `llvm.multidim.array.index.k A d1, d2, d3, ... dk` instead of flattening
> > the information into `gep A, d1 * n1 + d2 * n2 + ... + dk * nk`. The former
> > unflattened representation is advantageous for analysis and optimisation. It also
> > allows us to represent array indexing semantics of languages such as Fortran
> > and Chapel, which differs from that of C based array indexing.
> >
> > ## Motivating the need for a multi-dimensional array indexing intrinsic
> >
> > There are primarily one of two views of array indexing involving
> > multiple dimensions most languages take, which we discuss to motivate
> > the need for a multi-dimensional array index. This consideration impacts
> > the kinds of analysis we can perform on the program. In Polly, we care about
> > dependence analysis, so the examples here will focus on that particular problem.
> >
> > Let us consider an array indexing operation of the form:
> > ```cpp
> > int ex1(int n, int m, B[n][m], int x1, int x2, int y1, int y2) {
> > __builtin_assume(x1 != x2);
> > __builtin_assume(y1 != y2);
> > B[x1][y1] = 1;
> > printf("%d", B[x2][y2]);
> > exit(0);
> > }
> > ```
> >
> > One would like to infer that the array indices _interpreted as tuples_
> > `(x1, y1)` and `(x2, y2)` do not have the same value, due to the guarding asserts
> > that `x1 != x2` and `y1 != y2`. As a result, the write `B[x1][y1] = 1` can
> > in no way interfere with the value of `B[x2][y2]`. Consquently,
> > we can optimise the program into:
> >
> >
> > ```cpp
> > int ex1_opt(int n, int m, B[n][m], int x1, int x2, int y1, int y2) {
> > // B[x1][y1] = 1 is omitted because the result
> > // cannot be used:
> > // It is not used in the print and then the program exits
> > printf("%d", B[x2][y2]);
> > exit(0);
> > }
> > ```
> >
> > However, alas, this is illegal, for the C language does not provide
> > semantics that allow the final inference above. It is conceivable that
> > `x1 != x2, y1 != y2`, but the indices do actually alias, since
> > according to C semantics, the two indices alias if the _flattened
> > representation of the indices alias_. Consider the parameter
> > values:
> >
> > ```
> > n = m = 3
> > x1 = 1, y1 = 0; B[x1][y1] = nx1+y1 = 3*1+0=3
> > x2 = 0, y2 = 3; B[x2][y2] = nx2+y2 = 3*0+3=3
> > ```
> >
> > Hence, the array elements `B[x1][y1]` and `B[x2][y2]` _can alias_, and
> > so the transformation proposed in `ex1_opt` is unsound in general.
> >
> >
> > In contrast, many langagues other than C require that index
> > expressions for multidimensional arrays have each component within the
> > array dimension for that component. As a result, in the example above,
> > the index pair `(0,3)` would be out-of-bounds. In languages with these
> > semantics, one can infer that the indexing:
> >
> > `[x1][y1] != [x2][y2]` iff `x1 != x2 || y1 != y2`.
> >
> > While this particular example is not very interesting, it shows the
> > spirit of the lack of expressiveness in LLVM we are trying to
> > improve.
> >
> > Julia, Fortran, and Chapel are examples of such languages which target
> > LLVM.
> >
> > Currently, the LLVM semantics of `getelementptr` talk purely of
> > the C style flattened views of arrays. This inhibits the ability of the optimiser
> > to understand the multidimensional examples as given above, and we
> > are forced to make conservative assumptions, inhibiting optimisation.
> > This information must ideally be expressible in LLVM, such that LLVM's optimisers and
> > alias analysis can use this information to model multidimensional-array
> > semantics.
> >
> > There is a more realistic (and involved) example in [Appendix A](#Appendix-A)
> > in the same spirit as the above simple example, but one a compiler might
> > realistically wish to perform.
> >
> > ## Evaluation of the impact of the intrinsic on accuracy of dependence analysis
> >
> > - This has been implemented in an exprimental branch of Polly, and was used
> > on the COSMO climate weather model. This greatly helped increase the accuracy
> > of Polly's analysis, since we eliminated the guessing game from the array analysis.
> >
> > - This has also been implemented as part of a GSoC effort to unify
> > Chapel and Polly.
> >
> > - Molly, the distributed version of Polly written by Michael Kruse for his PhD
> > also implemented a similar scheme. In his use-case, optimistic run-time checks
> > with delinearization was not possible, so this kind of intrinsic was _necessary_
> > for the application, not just _good to have_. More details are available
> > in his PhD thesis: [Lattice QCD Optimization and Polytopic Representations of Distributed Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> > In particular, Chapter 9 contains a detailed discussion.
> >
> > # Representations
> >
> > ## Intrinsic
> >
> > ### Syntax
> > ```
> > <result> = llvm.multidim.array.index.* <ty> <ty>* <ptrval> {<stride>, <idx>}*
> > ```
> >
> > ### Overview:
> >
> > The `llvm.multidim.array.index.*` intrinsic is used to get the address of
> > an element from an array. It performs address calcuation only and
> > does not access memory. It is similar to `getelementptr`. However, it imposes
> > additional semantics which allows the optimiser to provide better optimisations
> > than `getlementptr`.
> >
> >
> >
> > ### Arguments:
> >
> > The first argument is always a type used as the basis for the calculations. The
> > second argument is always a pointer, and is the base address to start the
> > calculation from. The remaining arguments are a list of pairs. Each pair
> > contains a dimension stride, and an offset with respect to that stride.
> >
> >
> > ### Semantics:
> >
> > `llvm.multidim.array.index.*` represents a multi-dimensional array index, In particular, this will
> > mean that we will assume that all indices `<idx_i>` are non-negative.
> >
> > Additionally, we assume that, for each `<str_i>`, `<idx_i>` pair, that
> > `0 <= idx_i < str_i`.
> >
> > Optimizations can assume that, given two llvm.multidim.array.index.* instructions with matching types:
> >
> > ```
> > llvm.multidim.array.index.* <ty> <ty>* <ptrvalA> <strA_1>, <idxA_1>, ..., <strA_N>, <idxA_N>
> > llvm.multidim.array.index.* <ty> <ty>* <ptrvalB> <strB_1>, <idxB_1>, ..., <strB_N>, <idxb_N>
> > ```
> >
> > If `ptrvalA == ptrvalB` and the strides are equal `(strA_1 == strB_1 && ... && strA_N == strB_N)` then:
> >
> > - If all the indices are equal (that is, `idxA_1 == idxB_1 && ... && idxA_N == idxB_N`),
> > then the resulting pointer values are equal.
> >
> > - If any index value is not equal (that is, there exists an `i` such that `idxA_i != idxB_i`),
> > then the resulting pointer values are not equal.
> >
> >
> > ##### Address computation:
> > Consider an invocation of `llvm.multidim.array.index.*`:
> >
> > ```
> > <result> = call @llvm.multidim.array.index.* <ty> <ty>* <ptrval> <str_0>, <idx_0>, <str_1> <idx_1>, ..., <str_n> <idx_n>
> > ```
> >
> > If the pairs are denoted by `(str_i, idx_i)`, where `str_i` denotes the stride
> > and `idx_i` denotes the index of the ith pair, then the final address (in bytes)
> > is computed as:
> >
> > ```
> > ptrval + len(ty) * [(str_0 * idx_0) + (str_1 * idx_1) + ... (str_n * idx_n)]
> > ```
> >
> > ## Transitioning to `llvm.multidim.array.index.*`: Allow `multidim_array_index` to refer to a GEP instruction:
> >
> > This is a sketch of how we might gradually introduce the `llvm.multidim.array.index.*`
> > intrinsic into LLVM without immediately losing the analyses
> > that are performed on `getelememtptr` instructions. This section
> > lists out some possible choices that we have, since the authors
> > do not have a "best" solution.
> >
> > ##### Choice 1: Write a `llvm.multidim.array.index.*` to `GEP` pass, with the `GEP` annotated with metadata
> >
> > This pass will flatten all `llvm.multidim.array.index.*` expressions into a `GEP` annotated with metadata. This metadata will indicate that the index expression computed by the lowered GEP is guaranteed to be in a canonical form which allows the analysis
> > to infer stride and index sizes.
> >
> > A multidim index of the form:
> > ```
> > %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2, %idx_2
> > ```
> >
> > is lowered to:
> >
> > ```
> > %mul1 = mul nsw i64 %str_1, %idx_1
> > %mul2 = mul1 nsw i64 %str_2, %idx_2
> > %total = add nsw i64 %mul2, %mul1
> > %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total, !multidim !1
> > ```
> > with guarantees that the first term in each multiplication is the stride
> > and the second term in each multiplication is the index. (What happens
> > if intermediate transformation passes decide to change the order? This seems
> > complicated).
> >
> > **TODO:** Lookup how to attach metadata such that the metadata can communicate
> > which of the values are strides and which are indeces
> >
> >
> >
> > # Caveats
> >
> > Currently, we assume that the array shape is immutable. However, we will need to deal with
> > being able to express `reshape` like primitives where the array shape can be mutated. However,
> > this appears to make expressing this information quite difficult: We now need to attach the shape
> > information to an array per "shape-live-range".
> >
> >
> > ## Appendix-A
> >
> > ##### A realistic, more involved example of dependence analysis going wrong
> >
> > ```cpp
> > // In an array A of size (n0 x n1),
> > // fill a subarray of size (s0 x s1)
> > // which starts at an offset (o0 x o1) in the larger array A
> > void set_subarray(unsigned n0, unsigned n1,
> > unsigned o0, unsigned o1,
> > unsigned s0, unsigned s1,
> > float A[n0][n1]) {
> > for (unsigned i = 0; i < s0; i++)
> > for (unsigned j = 0; j < s1; j++)
> > S: A[i + o0][j + o1] = 1;
> > }
> > ```
> > We first reduce this index expression to a sum of products:
> >
> > ```
> > (i + o0) * n1 + (j + o1) = n1i + n1o0 + j + o1
> > ix(i, j, n0, n1, o0, o1) = n1i + n1o0 + j + o1
> > ```
> >
> > `ix` is the index expression which `LLVM` will see, since it is fully
> > flattened, in comparison with the multi-dimensional index expression
> > `index:[i + o0][j + o1] | sizes:[][n1]`.
> >
> > We will now show _why_ this multi-dimensional index is not always correct,
> > and why guessing for one language will not work for another:
> >
> > Consider a call `set_subarray(n0=8, n1=9, o0=4, o1=6, s0=3, s1=6)`. At face
> > value, this is incorrect if we view the array as 2D grid, since the size
> > of the array is `(n0, n1) = (8, 9)`, but we are writing an array of size
> > `(s0, s1) = (3, 6)` starting from `(o0, o1) = (4, 6)`. Clearly, we will
> > exceed the width of the array, since `(s1 + o1 = 6 + 6 = 12) > (n1 = 9)`.
> > However, now think of the array as a flattened 1D representation. In this
> > case, the total size of the array is `n1xn2 = 8x9 = 72`, while the largest
> > element we will access is at the largest value of `(i, j)`. That is,
> > `i = s0 - 1 = 2`, and `j = s1 - 1 = 5`.
> >
> > The largest index will be `ix(i=2, j=5, n0=8, n1=9, o0=4, o1=6) = 8*2+8*4+5+6=59`.
> > Since `59 < 72`, we are clearly at _legal_ array indices, by C semantics!
> >
> > The definition of the semantics of the language **changed the illegal
> > multidimensional access** (which is illegal since it exceeds the `n1`
> > dimension), into a **legal flattened 1D access** (which is legal since the
> > flattened array indices are inbounds).
> >
> > LLVM has no way of expressing these two different semantics. Hence, we are
> > forced to:
> > 1. Consider flattened 1D accesses, which makes analysis of index expressions
> > equivalent to analysis of polynomials over the integers, which is hard.
> > 2. Guess multidimensional representations, and use them at the expense of
> > soundness bugs as shown above.
> > 3. Guess multidimensional representations, use them, and check their validity
> > at runtime, causing a runtime performance hit. This implementation follows
> > the description from the paper [Optimistic Delinearization of Parametrically Sized Arrays](optim-delin-parametrically-sized-arrays).
> >
> >
> > Currently, Polly opts for option (3), which is to emit runtime checks. If
> > the run-time checks fail, then Polly will not run its optimised code. Instead,
> > It keeps a copy of the unoptimised code around, which is run in this case.
> > Note that this effectively doubles the amount of performance-sensitive code
> > which is finally emitted after running Polly.
> >
> > Ideally, we would like a mechanism to directly express the multidimensional
> > semantics, which would eliminate this kind of guesswork from Polly/LLVM,
> > which would both make code faster, and easier to analyze.
> >
> > ## References
> > - [The chapel language specification](https://chapel-lang.org/docs/1.13/_downloads/chapelLanguageSpec.pdf)
> > - [Fortran 2003 standard](http://www.j3-fortran.org/doc/year/04/04-007.pdf}{Fortran <http://www.j3-fortran.org/doc/year/04/04-007.pdf%7D%7BFortran> 2003 standard)
> > - [C++ subscripting](http://eel.is/c++draft/expr.sub)
> > - [Michael Kruse's PhD thesis: Lattice QCD Optimization and Polytopic Representations of Distributed Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> > - [Molly source code link for the intrinsic](https://github.com/Meinersbur/llvm/blob/molly/include/llvm/IR/IntrinsicsMolly.td#L3)
> > -[Optimistic Delinearization of Parametrically Sized Arrays](optim-delin-parametrically-sized-arrays)
> >
> > [optim-delin-parametrically-sized-arrays]:http://delivery.acm.org/10.1145/2760000/2751248/p351-grosser.pdf?ip=93.3.109.183&id=2751248&acc=CHORUS&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1557948443_2f56675c6d04796f27b84593535c9f70
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
> >
> >
> > --
> > --
> > Peter
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
--
Johannes Doerfert
Researcher
Argonne National Laboratory
Lemont, IL 60439, USA
jdoerfert at anl.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190722/af13e77d/attachment.sig>
More information about the llvm-dev
mailing list