[llvm-dev] [RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

Mon Oct 15 04:04:10 PDT 2018

Hi Renato,

Thanks for taking a look.

> On 11 Oct 2018, at 15:57, Renato Golin <renato.golin at linaro.org> wrote:
> 
> Hi Graham,
> 
> On Thu, 11 Oct 2018 at 15:14, Graham Hunter <Graham.Hunter at arm.com> wrote:
>>  1. I've changed Type::getPrimitiveSizeInBits to return 0 for vector types
>>     and changed all uses of it with vector types to use
>>     Type::getScalableSizeInBits instead, following the design in the RFC.
>> 
>>     While this shows where getPrimitiveSizeInBits is used with vector types,
>>     I think it would be better for now to change it back to avoid breaking
>>     existing targets and put an assert in it to ensure that it's only used
>>     on non-scalable vectors. We can revisit the decision later once
>>     scalable vector support is more mature. Thoughts?
> 
> Another solution would be to make it return ScalableSize.Unscaled. At
> least in a transition period.

True, though there are places in the code that expect a size of 0 to mean
"this is a pointer", so using scalable vectors with that could lead to
incorrect code being generated instead of an obvious ICE.

>>  2. There are two implementations of ScalableSize; one in Type.h, and one
>>     in DataLayout.h. I'd prefer to only have one, but the former reports
>>     sizes as 32bits while the latter uses 64bits.
>> 
>>     I think changing all size queries to use 64bits is the best way to
>>     resolve it -- are there any significant problems with that approach
>>     aside from lots of code churn?
>> 
>>     It would also be possible to use templates and typedefs, but I figure
>>     unifying size reporting would be better.
> 
> Agreed.
> 
> 
>>  3. I have only implemented 'strict' comparisons for now, which leads to
>>     some possibly-surprising results; {X, 0} compared with {0, X} will
>>     return false for both '==' and '<' comparisons, but true for '<='.
>> 
>>     I think that supporting 'maybe' results from overloaded operators
>>     would be a bad idea, so if/when we find cases where they are needed
>>     then I think new functions should be written to cover those cases
>>     and only used where it matters. For simple things like stopping
>>     casts between scalable and non-scalable vectors the strict
>>     comparisons should suffice.
> 
> How do you differentiate between maybe and certain?

This work is biased towards 'true' being valid if and only if the condition
holds for all possible values of vscale. This does mean that returning a
'false' in some cases may be incorrect, since the result could be true for
some (but not all) vscale values.

I don't know if 'maybe' results are useful on their own yet.

> Asserts making sure you never compare scalable with non-scalable in
> the wrong way would be heavy handed, but are the only sure way to
> avoid this pitfall.
> 
> A handler to make those comparisons safe (for example, returning
> safety breach via argument pointer) would be lighter, but require big
> code changes and won't work with overloaded operators.

My initial intention was for most existing code (especially in target
specific code for targets without scalable vectors) to continue using
the unscaled-only interfaces; there's also common code which is guarded
by a check for scalar types before querying size. I haven't counted up
all the cases that would need to change, but the majority will be fine
as is.

Do you think that implementing the comparisons without operator overloading
would be preferable? I know that APInt does this, so it wouldn't be
unprecedented in the codebase -- I was just trying to fit the existing code
without changing too much, but maybe that's the wrong approach.

Either passing in a pointer as you suggest, or returning an 'ErrorOr<bool>'
as a result would allow appropriate boolean results through and require
the calling code to handle 'maybes' (which could just mean bailing out of
whatever transformation that was about to be performed).

I'll take a look through some uses of DataLayout to see how well that would
work.

>>  4. Alignment for structs containing scalable types is tricky. For now,
>>     I've added an assert to force all structs containing scalable vectors
>>     to be packed.
> 
> I take it by "alignment" you mean element size (== structure size),
> not structure alignment, which IIUC, only depends on the ABI.

I mean alignment of elements within a struct, which does indeed determine
structure size.

> I remember vaguely that scalable vectors' alignment in memory is the
> same as the unit vector's, and the unit vector is known at compile
> time, just not the multiplicity.
> 
> Did I get that wrong?

That's correct, but data vectors (Z registers) and predicate vectors
(P registers) have different unit vector sizes: 128bits vs 16bits,
respectively.

We could insist that predicate vectors take up the same space as data
vectors, but that will waste some space.

>>     It won't be possible to calculate correct offsets at compile time if
>>     the minimum size of a struct member isn't a multiple of the required
>>     alignment for the subsequent element(s).
> 
> I assume this would be either an ABI decision or an extension to the
> standard, but we can re-use C99's VLA concepts, only here it's the
> element size that is unknown, not just the element count.
> 
> This would keep the costs of unknown offsets until runtime to a minimal.

Sure, it's something to handle at the ABI level, so I'd like to know if
RVV or NEC's vector architecture have any special requirements here.

I would hope that sufficient advice to the programmer would avoid this
being a common problem and predicate vectors were always placed after
data vectors, but we do need to make sure it will work the other way
round.

-Graham

> 
> It would also make sure undefined behaviour while accessing
> out-of-bounds offsets in a structure with SVE types break consistently
> and early. :)
> 
> cheers,
> --renato