[PATCH] D32530: [SVE][IR] Scalable Vector IR Type

Mon Apr 15 06:46:24 PDT 2019

huntergr added a comment.

In D32530#1464722 <https://reviews.llvm.org/D32530#1464722>, @hsaito wrote:

> In D32530#1463735 <https://reviews.llvm.org/D32530#1463735>, @joelkevinjones wrote:
>
> > I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.
>
> I think there is one more thing we still have to do. Does scalable vector type apply to all Instructions where non-scalable vector is allowed? If the answer is no, we need to identify which ones are not allowed to take scalable vector type operand/result. Some of the Instructions are not plain element-wise operation. Do we have agreed upon semantics for all those that are allowed?

The main difference is for 'shufflevector'. For a splat it's simple, since you just use a zeroinitializer mask. For anything else, though, you currently need a constant vector with immediate values; this obviously won't work when the number of elements isn't known at compile time.

Downstream, we solve this by allowing shufflevector masks to be ConstantExprs, and then using 'vscale' and 'stepvector' as symbolic Constant nodes. With those and a few arithmetic and logical ops, we can synthesize the usual set of shuffles (reverse, top/bottom half, odd/even, interleaves, zips, etc.). This would also work for fixed-length vectors. There's been some pushback on introducing them as symbolic constants, though, and the initial demo patch set has them as intrinsics.
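
As a rough illustration of the idea, here is a toy Python model (not LLVM IR and not any LLVM API). Plain lists stand in for vectors and 'vscale' is an ordinary integer. The helper names `stepvector`, `shuffle` and `masks`, and the `min_elts` parameter, are made up for this sketch. Each mask is written as arithmetic on a step vector, so the same expressions work for any element count:

```python
# Toy model only: lists stand in for vectors, vscale is a plain integer.
# None of these helpers correspond to real LLVM functions.

def stepvector(n):
    """Return [0, 1, ..., n-1], modelling a 'stepvector' constant."""
    return list(range(n))

def shuffle(a, b, mask):
    """Model shufflevector: each mask lane indexes into concat(a, b)."""
    src = a + b
    return [src[i] for i in mask]

def masks(vscale, min_elts=4):
    """Build common shuffle masks for a <vscale x min_elts x T> vector."""
    n = vscale * min_elts      # real element count, only known at run time
    step = stepvector(n)
    return {
        "splat":   [0 for _ in step],                 # all-zero mask
        "reverse": [(n - 1) - i for i in step],
        "even":    [2 * i for i in step],             # even lanes of concat(a, b)
        "odd":     [2 * i + 1 for i in step],         # odd lanes of concat(a, b)
        "zip_lo":  [(i >> 1) + (i & 1) * n for i in step],
        "zip_hi":  [(i >> 1) + (i & 1) * n + n // 2 for i in step],
    }
```

With `a = [10, 11, 12, 13]`, `b = [20, 21, 22, 23]` and `vscale = 1`, the "reverse" mask gives `[13, 12, 11, 10]` and "zip_lo" gives `[10, 20, 11, 21]`. The same mask expressions produce eight-lane masks when `vscale = 2`.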

So if we wanted to keep them as intrinsics for now, I think we have three options:

1. Leave discussion of more complicated shuffles until later, and only use scalable autovectorization on loops which don't need anything more than splats.
2. Introduce additional intrinsics for the other shuffle variants as needed.
3. Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

The discussion at the EuroLLVM roundtable was leaning towards option 1, with an action on me to provide a set of canonical shuffle examples using vscale and stepvector for community consideration.

https://reviews.llvm.org/D32530