[llvm-dev] [RFC][SVE] Supporting SIMD instruction sets with variable vector lengths
David Greene via llvm-dev
llvm-dev at lists.llvm.org
Tue Jun 5 12:08:49 PDT 2018
Graham Hunter <Graham.Hunter at arm.com> writes:
>> Can you explain a bit about what the two integers represent? What's the
>> "unscaled" part for?
> 'Unscaled' just means 'exactly this many bits', whereas 'scaled' is 'this many bits
> multiplied by vscale'.
Right, but what do they represent? If I have <scalable 4 x i32> is "32"
"unscaled" and "4" "scaled?" Or is "128" "scaled?" Or something else?
I see you answered this below.
>> The name "getSizeExpressionInBits" makes me think that a Value
>> expression will be returned (something like a ConstantExpr that uses
>> vscale). I would be surprised to get a pair of integers back. Do
>> clients actually need constant integer values or would a ConstantExpr
>> suffice? We could add a ConstantVScale or something to make it work.
> I agree the name is not ideal and I'm open to suggestions -- I was thinking of the two
> integers representing the known-at-compile-time terms in an expression:
> '(scaled_bits * vscale) + unscaled_bits'.
> Assuming the pair is of the form (unscaled, scaled), then for a type with a size known at
> compile time like <4 x i32> the size would be (128, 0).
> For a scalable type like <scalable 4 x i32> the size would be (0, 128).
> For a struct with, say, a <scalable 32 x i8> and an i64, it would be (64, 256).
> When calculating the offset for memory addresses, you just need to multiply the scaled
> part by vscale and add the unscaled as is.
Ok, now I understand what you're getting at. A ConstantExpr would
encapsulate this computation. We already have "non-static-constant"
values for ConstantExpr like sizeof and offsetof. I would see
VScaleConstant in that same tradition. In your struct example,
getSizeExpressionInBits would return:
add(mul(256, vscale), 64)
Does that satisfy your needs?
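
To make the two representations concrete, here is a minimal standalone sketch
of the (unscaled, scaled) pair and the expression it corresponds to. SizeExpr,
bitsAt and asExpr are hypothetical names used only for illustration; they are
not existing LLVM types or APIs, and padding/alignment is ignored.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the (unscaled, scaled) pair; not an LLVM type.
struct SizeExpr {
  std::uint64_t unscaledBits; // counted exactly as written
  std::uint64_t scaledBits;   // multiplied by the runtime value of vscale

  // Size in bits once vscale is known: (scaled_bits * vscale) + unscaled_bits.
  std::uint64_t bitsAt(std::uint64_t vscale) const {
    return scaledBits * vscale + unscaledBits;
  }

  // The same quantity spelled as a symbolic expression over vscale.
  std::string asExpr() const {
    return "add(mul(" + std::to_string(scaledBits) + ", vscale), " +
           std::to_string(unscaledBits) + ")";
  }
};

// Members laid out back to back add component-wise (padding ignored here).
inline SizeExpr operator+(SizeExpr a, SizeExpr b) {
  return {a.unscaledBits + b.unscaledBits, a.scaledBits + b.scaledBits};
}
```

With this sketch, <scalable 32 x i8> is SizeExpr{0, 256} and i64 is
SizeExpr{64, 0}; their sum is the pair (64, 256), which asExpr() spells as
add(mul(256, vscale), 64).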
Is there anything about vscale or a scalable vector that requires a
minimum bit width? For example, is this legal?
<scalable 1 x double>
I know it won't map to an SVE type. I'm simply curious because
traditionally Cray machines defined vectors in terms of
machine-dependent "maxvl" with an element type, so with the above vscale
would == maxvl. Not that we make any such things anymore. But maybe
someone else does?
>> If we went the ConstantExpr route and added ConstantExpr support to
>> ScalarEvolution, then SCEVs could be compared to do this size
>> comparison. We have code here that adds ConstantExpr support to
>> ScalarEvolution. We just didn't know if anyone else would be interested
>> in it since we added it solely for our Fortran frontend.
> We added a dedicated SCEV expression class for vscale instead; I suspect it works
> either way.
Yes, that's probably true. A vscale SCEV is less invasive.
> We've tried it as both an instruction and as a 'Constant', and both work fine with
> ScalarEvolution. I have not yet tried it with the intrinsic.
vscale as a Constant is interesting. It's a target-dependent Constant
like sizeof and offsetof. It doesn't have a statically known value and
maybe isn't "constant" across functions. So it's a strange kind of constant.
Ultimately whatever is easier for LLVM to analyze in the long run is
best. Intrinsics often block optimization. I don't know whether vscale
would be "easier" as a Constant or an Instruction.
>> As above, we could add ConstantVScale and also ConstantStepVector (or
>> ConstantIota). They won't fold to compile-time values but the
>> expressions could be simplified. I haven't really thought through the
>> implications of this, just brainstorming ideas. What does your
>> downstream compiler require in terms of constant support? What kinds of
>> queries does it need to do?
> It makes things a little easier to pattern match (just looking for a constant to start
> instead of having to match multiple different forms of vscale or stepvector multiplied
> and/or added in each place you're looking for them).
Ok. Normalization could help with this but I certainly understand the
> The bigger reason we currently depend on them being constant is that code generation
> generally looks at a single block at a time, and there are several expressions using
> vscale that we don't want to be generated in one block and passed around in a register,
> since many of the load/store addressing forms for instructions will already scale properly.
This is kind of like X86 memop folding. If a load has multiple uses, it
won't be folded, on the theory that one load is better than many folded
loads. If a load has exactly one use, it will fold. There's explicit
predicate code in the X86 backend to enforce this requirement. I
suspect if the X86 backend tried to fold a single load into multiple
places, Bad Things would happen (needed SDNodes might disappear, etc.).
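
As a rough illustration of the one-use rule described above, here is a toy
predicate. Node and shouldFoldLoad are hypothetical names for this sketch
only; this is not the actual X86 backend code, which checks considerably more
than a use count.

```cpp
#include <cstddef>

// Toy model of a value in a selection DAG; hypothetical, not LLVM's SDNode.
struct Node {
  bool isLoad;
  std::size_t numUses;
};

// Fold a load into its user only when that user is the sole consumer.
// With several uses, folding would repeat the memory access once per user,
// so the load is kept as a separate node instead.
inline bool shouldFoldLoad(const Node &n) {
  return n.isLoad && n.numUses == 1;
}
```

The analogous question for vscale is the opposite case: a vscale expression
may have many users, and each user would still prefer to see it directly
rather than as a value computed once and kept in a register.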
Codegen probably doesn't understand non-statically-constant
ConstantExprs, since sizeof or offsetof can be resolved by the target
before instruction selection.
> We've done this downstream by having them be Constants, but if there's a good way
> of doing them with intrinsics we'd be fine with that too.
If vscale/stepvector as Constants works, it seems fine to me.