[llvm-dev] [RFC][SVE] Supporting Scalable Vector Architectures in LLVM IR (take 2)

Thu Jul 6 15:03:09 PDT 2017

[Sending again to list]

Hi Chris,

Responses inline...

On 6 July 2017 at 21:02, Chris Lattner via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Thanks for sending this out Graham.  Here are some comments:
>
> This is a clever approach to unifying the two concepts, and I think that the approach is basically reasonable.  The primary problem that this will introduce is:
>
> 1) Almost anything touching (e.g. transforming) vector operations will have to be aware of this concept.  Given a first class implementation of SVE, I don’t see how that’s avoidable though, and your extension of VectorType is sensible.

Yes; however, we have found that the vast majority of vector
transforms don't need any modification to deal with scalable types.
There are obviously exceptions, such as analysing shuffle vector masks
for specific patterns.

>
> 2) This means that VectorType is sometimes fixed size, and sometimes unknowable.  I don’t think we have an existing analog for that in the type system.
>
> Is this type a first class type?  Can you PHI them, can you load/store them, can you pass them as function arguments without limitations?  If not, that is a serious problem.  How does struct layout with a scalable vector in it work?  What does an alloca of one of them look like?  What does a spill look like in codegen?

Yes, as an extension to VectorType they can be manipulated and passed
around like normal vectors: loaded and stored directly, used in phis,
put in LLVM structs, etc. Address computation generates expressions in
terms of vscale, and it seems to work well.
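
As a rough illustration of what "in terms of vscale" means here, the
sketch below computes the byte size of a scalable vector and the offset
of the Index-th consecutive vector in memory, following the RFC's rule
that vscale times m times the element size in bytes gives the total
length of <n x m x type>. The names ScalableVecDesc, vectorBytes and
vectorOffset are hypothetical and are not part of LLVM. In real IR
vscale is only known at run time, so it is passed in as a plain
parameter here.

```cpp
#include <cstdint>

// Hypothetical description of a scalable vector <n x m x type>:
// MinElts is m, EltBytes is the size of `type` in bytes.
struct ScalableVecDesc {
  std::uint64_t MinElts;
  std::uint64_t EltBytes;
};

// Total size in bytes: vscale * m * sizeof(type).
inline std::uint64_t vectorBytes(ScalableVecDesc V, std::uint64_t VScale) {
  return VScale * V.MinElts * V.EltBytes;
}

// Byte offset of the Index-th consecutive vector from a base address.
inline std::uint64_t vectorOffset(ScalableVecDesc V, std::uint64_t VScale,
                                  std::uint64_t Index) {
  return Index * vectorBytes(V, VScale);
}
```

For example, under this model <n x 4 x i32> with vscale equal to 2
occupies 32 bytes, so the vector at index 3 starts 96 bytes past the
base.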
>
> I think that a target-specific type (e.g. like we have X86_mmx) is the only reasonable alternative.  A subclass of VectorType is just another implementation approach of your design above.  This is assuming that scalable vectors are really first class types.
>
> The pros and cons of a separate type is that it avoids you having to touch everything that touches VectorTypes, and if it turns out that the code that needs to handle normal SIMD and scalable SIMD vectors is different, then it is a win to split them into two types.  If, on the other hand, most code would treat the two types similarly, then it is better to just have one type.

Fortunately, the latter case is exactly what we've found. Most
operations on vectors are not actually concerned with their absolute
size; if anything, they are concerned with relative sizes.
>
> The major concern I have here is that I’m not sure how scalable vectors can be considered to be first class types, given that we don’t know their size.  If they can’t be put in an LLVM struct (for example), then this would pose a significant problem with your current approach.  It would be a huge problem if VectorType could be in structs in some cases, but not others.

We can have them as first class types, but as you say it does require
us to be careful when reasoning about their sizes. In practice there
are architectural limits on the sizes of vectors, so it's possible to
have an upper bound on the size. However, to be completely accurate,
type sizes in LLVM probably need to have some symbolic representation
so that we can reason about their sizes in terms of, essentially,
the vscale constant. The other potential avenue is to make all type
size queries in LLVM return optional values. We haven't implemented
either of these and we haven't yet hit an issue, which is not to say
there isn't one. I think most of the uses of querying type sizes are
to compare against other type sizes, so relative comparisons still
work even with scalable types. This is an area where we want
community input to build consensus, though.
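
To make the two avenues concrete, here is a minimal sketch (C++17)
that combines them: a size is stored symbolically as either MinBytes or
MinBytes * vscale, and a comparison returns an optional that is empty
when the answer depends on the runtime value of vscale. This is not
what has been implemented; SymbolicSize and knownLessOrEqual are
made-up names, and the sketch assumes only that vscale >= 1, ignoring
any architectural upper bound.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical symbolic size: MinBytes for a fixed-size type,
// MinBytes * vscale for a scalable one.
struct SymbolicSize {
  std::uint64_t MinBytes;
  bool Scalable;
};

// Returns true or false when A <= B is decided for every vscale >= 1,
// and std::nullopt (conservatively) when it may depend on vscale.
inline std::optional<bool> knownLessOrEqual(SymbolicSize A, SymbolicSize B) {
  // Same kind: the vscale factor (if any) cancels out.
  if (A.Scalable == B.Scalable)
    return A.MinBytes <= B.MinBytes;

  if (!A.Scalable) {
    // Fixed A against scalable B: B is at least B.MinBytes.
    if (A.MinBytes <= B.MinBytes)
      return true;
    return std::nullopt;
  }

  // Scalable A against fixed B: A is at least A.MinBytes.
  if (A.MinBytes > B.MinBytes)
    return false;
  return std::nullopt;
}
```

Comparing two scalable types this way needs no knowledge of vscale,
which matches the observation that relative comparisons keep working.
Only mixed fixed/scalable comparisons can come back empty.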

>> With a scalable vector type defined, we now need a way to generate addresses for
>> consecutive vector values in memory and to be able to create basic constant
>> vector values.
>>
>> For address generation, the `vscale` constant is added to represent the runtime
>> value of `n` in `<n x m x type>`.
>
> This should probably be an intrinsic, not an llvm::Constant.  The design of llvm::Constant is already wrong: it shouldn’t have operations like divide, and it would be better to not contribute to the problem.

Could you explain your position on this in more detail? The Constant
architecture has been a very natural fit for this concept from our
perspective.
>
>> Multiplying `vscale` by `m` and the number of
>> bytes in `type` gives the total length of a scalable vector, and the backend
>> can pattern match to the various load and store instructions in SVE that
>> automatically scale with vector length.
>
> It is fine for the intrinsic to turn into a target specific ISD node in selection dag to allow your pattern matching.
>

>
>> How do we spill/fill scalable registers on the stack?
>> -----------------------------------------------------
>>
>> SVE registers have a (partially) unknown size at build time and their associated
>> fill/spill instructions require an offset that is implicitly scaled by the
>> vector length instead of bytes or element size. To accommodate this we
>> created the concept of Stack Regions that are areas on the stack associated
>> with specific data types or register classes.
>
> Ok, that sounds complicated, but can surely be made to work.  The bigger problem is that there are various LLVM IR transformations that want to put registers into memory.  All of these will be broken with this sort of type.

Could you give an example?

Thanks for taking the time to review this,
Amara