[llvm-dev] [RFC] A new multidimensional array indexing intrinsic

Wed Aug 7 21:00:19 PDT 2019

On Aug 2, 2019, at 4:18 PM, Michael Kruse <llvmdev at meinersbur.de> wrote:
>> Are you interested in the C family, or another language (e.g. Fortran) that has multidimensional arrays as a first-class concept?  I can see how this is useful for the latter case, but in the former case you’d just be changing where you do the pattern matching, right?
> 
> Both. Transformations on LLVM-IR have the advantage of being applicable
> to all languages that have an LLVM-IR codegen. The motivating case
> here is Chapel. For Fortran we could do the transformations on MLIR,
> but I do not see that it is a lot easier on LLVM (modulo allocatable
> arrays) if the front-end does not emit affine.for.

Well sure, if you try to do this on an LLVM IR-level abstraction, then of course your memory layout is pinned :-).  The payoff of using something like MLIR is to use a higher-level abstraction, one which treats multidimensional arrays as a first-class object that doesn’t have an overly constrained layout, and to lower it to something more constrained after optimizations.  This is what all the ML frameworks do.

> For C/C++ we could
> add a language extension (builtin, attribute, ...) that allows
> conveying this information to the mid-end. Even if not, we do want to
> optimize applications that are written in C/C++.

The major advantage of using MLIR for this (for something like C, given that you did the work for a higher-level language to define the multidimensional abstraction of your choice) is that it provides that abstraction to raise into - which doesn’t have to be all or nothing.  One challenge that some systems run into is that they work very hard to fit the world into their abstraction, but if it doesn’t fit even for a tiny reason, the entire acceleration structure fails.

In any case, I understand that MLIR isn’t an option for you here, and I’m not trying to sell you on it, just curious how you’re thinking about these things.

>>> The other feedback in this thread was mostly against using an
>>> intrinsic. Would you prefer starting with an intrinsic or would the
>>> other suggested approaches be fine as well?
>> 
>> Which other approach are you referring to?  I’d pretty strongly prefer *not* to add an instruction for this.  It is generally better to start things out as experimental intrinsics, get experience with them, and then, if they make sense, promote them to instructions.
> 
> This was already my argument in this thread.
> 
> Suggestions I found in this thread:
> * Use llvm.assume to compare a GEP result ptr with the multidimensional one
> * Extend GEP with more arguments
> * Extend GEP in another unspecified way
> * Add metadata (such as operand bundles) to GEP that can be dropped
> (like MDNode)
> * Add metadata to memory accesses
> * Add a dynamically sized array type

IMO we should be less precious about experimental intrinsics and the bar should be somewhat low - the implementation needs to follow best practices (e.g. re: documentation) but I think we can afford some experimentation here, perhaps with a specified way to eject stuff that doesn’t work out.

-Chris