[PATCH] D141924: [IR] Add new intrinsics interleave and deinterleave vectors

Mon Jan 23 18:20:23 PST 2023

paulwalker-arm added a comment.

Hi @CarolineConcatto & @sdesmalen, I've a couple of simplifications for you to consider which I believe make things easier to work with.  Perhaps simplifications is the wrong word, but I do think they'll introduce more uniformity to the design.

Intrinsics:
What about implementing total shuffles and breaking the dependence on vector types having any meaning, with the shape encoded as a discrete immediate operand instead? For example:

  Ty A = @llvm.experimental.vector.interleave.Ty(Vec, shape)
  Ty B = @llvm.experimental.vector.deinterleave.Ty(Vec, shape)

Here the intrinsics are simply vector in, vector out, with all input lanes existing in the output, just at a different location (this is what I mean by a total shuffle).  If only part of the result is important to the caller then they'll just extract the part they need. Here `shape` essentially refers to the number of subvectors that are logically contained within Vec (for interleave) or B (for deinterleave), and for this initial implementation we'd restrict support to just the value `2`.  The main usage rule is that `Ty.getKnownMinElementCount()` must be divisible by `shape`.
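
To make the lane movement concrete, here is a small reference model of the two total shuffles on fixed-length lists. It is only a sketch of the semantics as I read them from the description above, not the patch's implementation: the function names `interleave_lanes` and `deinterleave_lanes` are made up for illustration, and scalable vectors are modelled by their known minimum element count.

```python
def interleave_lanes(vec, shape):
    """Model of a total interleave (hypothetical helper, not an LLVM API).

    `vec` is treated as `shape` subvectors laid end to end; the result
    takes one lane from each subvector in turn.
    """
    n = len(vec)
    if shape <= 0 or n % shape != 0:
        raise ValueError("element count must be divisible by shape")
    sub = n // shape  # lanes per logical subvector
    return [vec[j * sub + i] for i in range(sub) for j in range(shape)]


def deinterleave_lanes(vec, shape):
    """Model of a total deinterleave (hypothetical helper, not an LLVM API).

    The result is `shape` subvectors laid end to end; subvector j holds
    every lane of `vec` whose index is congruent to j modulo `shape`.
    """
    n = len(vec)
    if shape <= 0 or n % shape != 0:
        raise ValueError("element count must be divisible by shape")
    sub = n // shape  # lanes per logical subvector
    return [vec[i * shape + j] for j in range(shape) for i in range(sub)]


lanes = list(range(8))
mixed = interleave_lanes(lanes, 2)
restored = deinterleave_lanes(mixed, 2)
# A caller that only wants the even lanes extracts the first half.
even_lanes = deinterleave_lanes(lanes, 2)[:4]
```

Every input lane appears exactly once in the output, and a caller who only needs one of the logical subvectors extracts it from the result, as with `even_lanes` above.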

Do you see any issues here? My thinking is that it becomes trivial to see how we'd support other strides (i.e. we'd just extend the verifier to allow shape=new-stride).

CodeGen:
As you know the vector types here are critical because we must be able to legalise all supported variants.  I prefer how you've defined `ISD::VECTOR_INTERLEAVE` over `ISD::VECTOR_DEINTERLEAVE` because it represents a total shuffle.

However, my proposal is to match the above intrinsic interface but replace the single vector in/out rule with one that dictates the ISD nodes must have a matching number of vector inputs and outputs, all of the same type.  The shape operand remains as is, and the operations are defined to first concatenate all N vector operands, then perform the necessary shuffle (based on the shape), and finally split the result evenly into N vectors.

The important thing here is that `shape` does not dictate anything about the number of vectors. Once all type legalisation is in place I'd expect a simple mapping from intrinsic to ISD node. After type legalisation the vector counts might change but `shape` does not. I think the shape gives enough information to guide type/operation legalisation in the best order to split, promote or widen the vectors?
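
As a rough illustration of the node definition (concatenate the N operands, shuffle by `shape`, split back into N results), here is a sketch for the interleave case. The name `interleave_node` is hypothetical and the lists stand in for legal vector types; real legalisation would of course not materialise the concatenation.

```python
def interleave_node(operands, shape):
    """Model of an N-in, N-out interleave node (hypothetical helper).

    All operands must have the same length. `shape` describes the
    shuffle only; it says nothing about how many operands there are.
    """
    if not operands:
        raise ValueError("need at least one operand")
    width = len(operands[0])
    if any(len(op) != width for op in operands):
        raise ValueError("all operands must have the same type")

    # Step 1: concatenate all N operands into one logical vector.
    concat = [lane for op in operands for lane in op]
    if shape <= 0 or len(concat) % shape != 0:
        raise ValueError("total element count must be divisible by shape")

    # Step 2: total interleave of the logical vector, driven by shape.
    sub = len(concat) // shape
    shuffled = [concat[j * sub + i] for i in range(sub) for j in range(shape)]

    # Step 3: split evenly back into N results of the operand type.
    return [shuffled[k * width:(k + 1) * width] for k in range(len(operands))]


# Two operands of four lanes each, shape = 2.
two_wide = interleave_node([[0, 1, 2, 3], [4, 5, 6, 7]], 2)

# The same data after each operand has been split in half: the operand
# count doubles, shape is unchanged, and the concatenated result is the same.
four_narrow = interleave_node([[0, 1], [2, 3], [4, 5], [6, 7]], 2)
```

In this model, flattening `two_wide` and `four_narrow` gives the same lane order, which is the property I'm relying on when I say the vector count may change during legalisation while `shape` does not.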

I don't think this deviates much from your current design but does provide more extensibility. Given this patch is not worrying about type legalisation the biggest change is likely to be to the operation descriptions, but I'd appreciate a little tire kicking to see if I'm misrepresenting the benefits. What do you think?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141924/new/

https://reviews.llvm.org/D141924