[llvm-dev] [RFC] Extending shufflevector for vscale vectors (SVE etc.)

Thu Feb 13 15:47:10 PST 2020

> On Feb 10, 2020, at 7:31 AM, Sander De Smalen <Sander.DeSmalen at arm.com> wrote:
> 
>> On 8 Feb 2020, at 01:52, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> Some of this is what eventually led to the way we lower llvm.vscale: we have the intrinsic, but we have a hack in codegenprepare to convert it to a GEP constant expression.
>> Since you have the GEP trick that works with variable length vectors, why do you need llvm.vscale anymore at all?
> As far as I understood it, it would be dependent on the target's DataLayout whether the GEP trick would be a valid transformation because of vector alignment (https://reviews.llvm.org/D71636#1792473).

Sure, but GEP is DataLayout specific in general.  I am not an expert on this, but isn't this just a question of generating the right IR?

>> If you take a step back and look at the structure of the problem we’re working with, the existing ShuffleVector design is very problematic IMO.  Among other things, knowing that shuffle output elements are undef but not knowing that about calls and loads of vectors is pretty weird.
> The masked load intrinsics have a predicate and a passthru operand. Is that not what you mean?

Sure, that’s one example of this.  The general issue is that “reasoning about undef vector element results” is not specific to shuffles.  It applies equally to other operations as well.

>> Why don’t we have something like the !range metadata that applies to load/call/shuffle to indicate that the result vector elements have some unknown elements?  This would be far more general and correct.
> Instead of adding a new attribute, can we use a `select` (or perhaps a new intrinsic) to express which lanes are undef? That way the same mechanism would also work for scalable vectors without having to figure out some convoluted way to describe that in metadata.
> 
> At the moment the `select` would be thrown away by InstCombine which folds `select %p, %v, undef -> %v`. I'm not sure to what extend this is desirable; it is useful to be folded away for certain purposes but not when you want to explicitly specify that something can be undef. Perhaps a new intrinsic would be more appropriate?

The goal here is to keep scalable and fixed vectors as similar as possible, while not allowing the mechanisms to regress fixed vectors.  I don’t think that select will do that.  We’re talking vector-element level undef analysis, not undef of the whole vector.

> 
>>> On Feb 7, 2020, at 4:42 PM, Eli Friedman <efriedma at quicinc.com> wrote:
>>>> The other aspect here is the use of "undef" in fixed shuffles.  Frontends usually don't do this, but some of our optimizations for shuffles are built around replacing unused results of shuffles with "undef".  We don't want to forbid transforming "<0, 4, 1, 5>" -> "<0, undef, 1, 5>" because we decided the former was the named shuffle "zip", and we want to allow transforms/analysis for zip-like shuffles on "<0, undef, 1, 5>".  So I think it ends up being more straightforward to have a helper for "is this a zip", as opposed to a helper to transforming a "zip" to a fixed shuffle mask.
> For some shufflemask like `<0, undef, undef, undef>`, I guess this means that 'isSplat' and 'isZipLo' would both return true?

Yes.  That’s already generally true of the target isel stuff.

> Just recapping the discussion here for my understanding (please correct me where I'm wrong):
> 
> - The original proposal is to extend `shufflevector` to take either a 'shuffle pattern' or a 'shuffle mask'.
>  - For fixed-width shuffles it will retain the mask, in case any of the elements is 'undef' (so not to lose that information)
>  - ShuffleVectorInst is extended with methods to query if something e.g. is a zip/splat/etc.
> 
> - An alternative would be to not unify these into a single `shufflevector`, but rather have multiple operations.
>  - InstCombine can segment shuffles into two exclusive sets of operations: known shuffle patterns vs unknown pattern with explicit mask
>  - To avoid multiple representations for the same shuffle, we want to map to exclusively either to 'explicit mask' or 'known mask' operations.
>    But in order not to lose information about lanes being undef, we'd then need to model the 'undef' lanes separate from the instruction.

Right.

> For either route, does this mean we'll also want to consider supporting data-dependent shuffle masks? (I don't see why we wouldn't support those if we go through the lengths of adding support for more types of shuffles).

I’m saying that this is a likely future, but I would really prefer NOT to lump it into the same discussion.  It is just useful to keep the bigger context in mind while talking about specific points.

-Chris