[llvm-dev] [RFC] Extending shufflevector for vscale vectors (SVE etc.)

Fri Feb 7 12:39:14 PST 2020

> -----Original Message-----
> From: Chris Lattner <clattner at nondot.org>
> Sent: Wednesday, February 5, 2020 4:02 PM
> To: Eli Friedman <efriedma at quicinc.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: [EXT] Re: [llvm-dev] [RFC] Extending shufflevector for vscale vectors
> (SVE etc.)
>
> On Jan 29, 2020, at 4:48 PM, Eli Friedman via llvm-dev <llvm-
> dev at lists.llvm.org> wrote:
> >
> > Currently, for scalable vectors, only splat shuffles are allowed; we're
> considering allowing more kinds of shuffles.  The issue is,
> essentially, that a shuffle mask is a simple list of integers, and that isn't
> enough to express a scalable operation.  For example, concatenating two
> fixed-length vectors currently looks like this:
> >
> > Proposed IR syntax:
> >
> > %result = shufflevector <vscale x 4 x i32> %v1, <vscale x 4 x i32> %v2,
> SHUFFLE_NAME
> >
> > Alternatives:
> >
> > Instead of extending shufflevector, we could introduce a dedicated
> intrinsic for each common shuffle.  This is less readable, and makes it harder
> to leverage existing code that reasons about shuffles.  But it would mean
> fewer changes to existing code.
>
> Hi Eli,
>
> Did you consider a design point between these two extremes?  You could
> introduce one new instruction, something like “fixed shuffle vector” that
> takes two vectors and an enum.  That would keep the structurally (and
> representationally) different cases as separate instructions, without creating
> a new instruction per fixed shuffle kind.

There are really two forms of this: I could add a new instruction, or I could add a new intrinsic (maybe using a metadata string to specify the shuffle).  A new instruction requires a ton of boilerplate, and an intrinsic means we don't get shufflevector constant expressions, which are useful for optimization.

Either way, it's a bunch of extra work if we intend to eventually unify the new operation with shufflevector, and I don't see any scenario in which we wouldn't want to unify them.  The operations I'm adding are semantically the same as the equivalent fixed-width shuffles; we just can't represent the shuffle masks the same way.  And if we do end up changing the representation of scalable shufflevectors later, I think we'll be able to autoupgrade the existing ones.

I think I can keep the initial patch relatively small if we wrap the abstract notion of a "ShuffleMask", which is either a fixed shuffle or a named scalable shuffle, in a C++ class.  Optimizations that expect fixed shuffles can then just convert it to the ArrayRef<int> they expect.
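
A minimal sketch of such a wrapper, assuming nothing about the eventual patch: the class layout, the NamedShuffle enumerators, and countLanesFromFirstOperand are hypothetical names chosen for illustration, and std::vector<int> stands in for LLVM's ArrayRef<int> so the example compiles on its own.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical set of named scalable shuffles; illustrative only.
enum class NamedShuffle { Concat, Zip, Unzip };

// Hypothetical wrapper: holds either a fixed list of lane indices
// or the name of a scalable shuffle, never both.
class ShuffleMask {
  std::variant<std::vector<int>, NamedShuffle> Storage;

public:
  explicit ShuffleMask(std::vector<int> Fixed) : Storage(std::move(Fixed)) {}
  explicit ShuffleMask(NamedShuffle Named) : Storage(Named) {}

  // Returns the fixed mask, or nullptr for a named scalable shuffle.
  const std::vector<int> *getFixedMask() const {
    return std::get_if<std::vector<int>>(&Storage);
  }

  // Returns the shuffle name, or std::nullopt for a fixed mask.
  std::optional<NamedShuffle> getNamedShuffle() const {
    if (const auto *Named = std::get_if<NamedShuffle>(&Storage))
      return *Named;
    return std::nullopt;
  }
};

// Stand-in for an optimization that only understands fixed masks:
// counts result lanes taken from the first operand, or returns -1
// when handed a named scalable shuffle it cannot reason about.
int countLanesFromFirstOperand(const ShuffleMask &M, int NumSrcElts) {
  const std::vector<int> *Mask = M.getFixedMask();
  if (!Mask)
    return -1;
  int Count = 0;
  for (int Idx : *Mask)
    if (Idx >= 0 && Idx < NumSrcElts)
      ++Count;
  return Count;
}
```

The point of the design is that code written for fixed shuffles needs only one extra check (a null fixed mask) to stay correct when it meets a scalable shuffle.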

> Relatedly, how do you foresee canonicalization (in instcombine and inst
> selection) working for these?  If not for compatibility, it would make sense to
> canonicalize from shuffle vector to the ‘fixed’ formats, but doing that would
> probably introduce a bunch of regressions for various targets.

I'm thinking that we don't use the new named shuffles for fixed-width shuffles at the IR level.  Instead, we add helpers to ShuffleVectorInst that match either a named scalable shuffle or the equivalent fixed shuffle.  That way, code that wants to handle both can pretend they're canonicalized, and code that only handles fixed shuffles won't be disrupted.
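
A sketch of what one such helper could look like, under the same caveats: isConcatMask, the Mask alias, and the NamedShuffle enumerators are hypothetical, and the real helpers would live on ShuffleVectorInst rather than operate on a bare variant. This version is deliberately conservative and rejects fixed masks containing undef (-1) lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <variant>
#include <vector>

// Hypothetical set of named scalable shuffles; illustrative only.
enum class NamedShuffle { Concat, Zip, Unzip };

// Either a fixed list of lane indices or a named scalable shuffle.
using Mask = std::variant<std::vector<int>, NamedShuffle>;

// Returns true if the mask concatenates its two operands, whether it is
// written as the named scalable shuffle or as the equivalent fixed mask
// <0, 1, ..., 2*NumSrcElts-1>. NumSrcElts is the element count of one
// operand and is ignored for named shuffles.
bool isConcatMask(const Mask &M, std::size_t NumSrcElts) {
  if (const auto *Named = std::get_if<NamedShuffle>(&M))
    return *Named == NamedShuffle::Concat;

  const auto &Fixed = std::get<std::vector<int>>(M);
  if (Fixed.size() != 2 * NumSrcElts)
    return false;
  for (std::size_t I = 0; I < Fixed.size(); ++I)
    if (Fixed[I] != static_cast<int>(I))
      return false;
  return true;
}
```

A caller that handles both forms asks only isConcatMask and never needs to know which representation it was given.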

In SelectionDAG, shuffles aren't really unified the same way they are in IR.  I think we map onto existing operations where that makes sense (CONCAT_VECTORS and EXTRACT_SUBVECTOR).  For scalable zip/unzip, I haven't thought deeply about it; it probably makes sense to change SHUFFLE_VECTOR's shuffle mask to use ShuffleMask like IR, but that probably doesn't have a big impact either way.

-Eli