[llvm-dev] [RFC][SVE] Supporting SIMD instruction sets with variable vector lengths
llvm-dev at lists.llvm.org
Tue Jun 5 08:23:00 PDT 2018
Just a few initial comments.
Graham Hunter <Graham.Hunter at arm.com> writes:
> ``<scalable x 4 x i32>`` and ``<scalable x 8 x i16>`` have the same number of
"scalable" instead of "scalable x."
> For derived types, a function (getSizeExpressionInBits) to return a pair of
> integers (one to indicate unscaled bits, the other for bits that need to be
> scaled by the runtime multiple) will be added. For backends that do not need to
> deal with scalable types, another function (getFixedSizeExpressionInBits) that
> only returns unscaled bits will be provided, with a debug assert that the type
> isn't scalable.
Can you explain a bit about what the two integers represent? What's the
"unscaled" part for?
The name "getSizeExpressionInBits" makes me think that a Value
expression will be returned (something like a ConstantExpr that uses
vscale). I would be surprised to get a pair of integers back. Do
clients actually need constant integer values or would a ConstantExpr
suffice? We could add a ConstantVScale or something to make it work.
> Comparing two of these sizes together is straightforward if only unscaled sizes
> are used. Comparisons between scaled sizes is also simple when comparing sizes
> within a function (or across functions with the inherit flag mentioned in the
> changes to the type), but cannot be compared otherwise. If a mix is present,
> then any number of unscaled bits will not be considered to have a greater size
> than a smaller number of scaled bits, but a smaller number of unscaled bits
> will be considered to have a smaller size than a greater number of scaled bits
> (since the runtime multiple is at least one).
If we went the ConstantExpr route and added ConstantExpr support to
ScalarEvolution, then SCEVs could be compared to do this size
comparison. We have code here that adds ConstantExpr support to
ScalarEvolution. We just didn't know if anyone else would be interested
in it since we added it solely for our Fortran frontend.
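To make the quoted comparison rule concrete, here is a rough C++ sketch. It assumes the pair means "unscaled bits + scaled bits * vscale" with vscale >= 1, which is my reading of the RFC text rather than something it states in this form. SizeExpr, knownLessOrEqual and knownLess are hypothetical names, not part of the RFC or of LLVM.

```cpp
#include <cstdint>

// Hypothetical model of the proposed size pair. The total size in bits is
// assumed to be Unscaled + Scaled * vscale, where vscale is an unknown
// runtime multiple that is at least one.
struct SizeExpr {
  uint64_t Unscaled; // bits that do not depend on vscale
  uint64_t Scaled;   // bits that are multiplied by vscale
};

// True only if A <= B holds for every vscale >= 1. Both sizes are linear in
// vscale, so it is enough to check the value at vscale == 1 and to check
// that A does not grow faster than B.
bool knownLessOrEqual(SizeExpr A, SizeExpr B) {
  return A.Scaled <= B.Scaled &&
         A.Unscaled + A.Scaled <= B.Unscaled + B.Scaled;
}

// True only if A < B holds for every vscale >= 1.
bool knownLess(SizeExpr A, SizeExpr B) {
  return A.Scaled <= B.Scaled &&
         A.Unscaled + A.Scaled < B.Unscaled + B.Scaled;
}
```

With this model, 64 unscaled bits are known to be smaller than 128 scaled bits, but 256 unscaled bits and 128 scaled bits are not ordered in either direction, which matches the mixed case described in the quoted paragraph.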
> We have added an experimental `vscale` intrinsic to represent the runtime
> multiple. Multiplying the result of this intrinsic by the minimum number of
> elements in a vector gives the total number of elements in a scalable vector.
I think this may be a case where adding a full-fledged Instruction might
be worthwhile. Because vscale is intimately tied to addressing, it
seems like things such as ScalarEvolution support will be important. I
don't know what's involved in making intrinsics work with
ScalarEvolution but it seems odd that a key component of IR
computation would live outside the IR proper, in the sense that all
other fundamental addressing operations are Instructions.
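For reference, the arithmetic in the quoted paragraph and its tie to addressing can be sketched in a few lines of C++. This only models the calculation; totalElements and vectorSizeInBytes are hypothetical helper names, and nothing here is an LLVM API.

```cpp
#include <cstdint>

// Hypothetical helper: element count of a scalable vector whose type
// guarantees MinElems elements, given the runtime multiple VScale (>= 1).
uint64_t totalElements(uint64_t MinElems, uint64_t VScale) {
  return MinElems * VScale;
}

// Hypothetical helper: size in bytes of one such vector, i.e. the stride
// needed to step from one whole vector in memory to the next.
// Assumes ElemBits is a multiple of 8.
uint64_t vectorSizeInBytes(uint64_t MinElems, uint64_t ElemBits,
                           uint64_t VScale) {
  return totalElements(MinElems, VScale) * (ElemBits / 8);
}
```

For example, under these assumptions a ``<scalable x 4 x i32>`` with a runtime multiple of 2 has 8 elements and occupies 32 bytes.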
> For constants consisting of a sequence of values, an experimental `stepvector`
> intrinsic has been added to represent a simple constant of the form
> `<0, 1, 2... num_elems-1>`. To change the starting value a splat of the new
> start can be added, and changing the step requires multiplying by a splat.
This is another case where an Instruction might be better, for the same
reasons as with vscale.
Also, "iota" is the name Cray has traditionally used for this operation
as it is the mathematical name for the concept. It's also used by C++
and Go and so should be familiar to many people.
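As an illustration of the quoted recipe using the C++ iota mentioned above, here is a small sketch that builds a stepped sequence on a fixed-length std::vector. steppedSequence is a hypothetical name, and a real scalable vector would only know its element count at runtime.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical model of the quoted recipe: start from <0, 1, ..., N-1>,
// multiply by a splat of Step, then add a splat of Start.
std::vector<int64_t> steppedSequence(std::size_t NumElems, int64_t Start,
                                     int64_t Step) {
  std::vector<int64_t> V(NumElems);
  std::iota(V.begin(), V.end(), int64_t{0}); // the plain step vector
  for (int64_t &E : V)
    E = E * Step + Start; // scale by the step, then offset by the start
  return V;
}
```

Calling steppedSequence(4, 10, 3) yields the elements 10, 13, 16, 19.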
> Future Work
> Intrinsics cannot currently be used for constant folding. Our downstream
> compiler (using Constants instead of intrinsics) relies quite heavily on this
> for good code generation, so we will need to find new ways to recognize and
> fold these values.
As above, we could add ConstantVScale and also ConstantStepVector (or
ConstantIota). They won't fold to compile-time values but the
expressions could be simplified. I haven't really thought through the
implications of this, just brainstorming ideas. What does your
downstream compiler require in terms of constant support? What kinds of
queries does it need to do?