[llvm-dev] [RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

David A. Greene via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 31 14:32:31 PDT 2018


Robin Kruppe <robin.kruppe at gmail.com> writes:

>> Yes, the "is this supported" question is common.  Isn't the whole point
>> of VPlan to get the "which one is better" question answered for
>> vectorization?  That would be necessarily tied to the target.  The
>> questions asked can be agnostic, like the target-agnostic bits of
>> codegen use, but the answers would be target-specific.
>
> Just like the old loop vectorizer, VPlan will need a cost model that
> is based on properties of the target, exposed to the optimizer in the
> form of e.g. TargetLowering hooks. But we should try really hard to
> avoid having a hard distinction between e.g. predication- and VL-based
> loops in the VPlan representation. Duplicating or triplicating
> vectorization logic would be really bad, and there are a lot of
> similarities that we can exploit to avoid that. For a simple example,
> SVE and RVV both want the same basic loop skeleton: strip-mining with
> predication of the loop body derived from the induction variable.
> Hopefully we can have a 99% unified VPlan pipeline and most
> differences can be delegated to the final VPlan->IR step and the
> respective backends.
>
> + Diego, Florian and others that have been discussing this previously

If VL and predication are represented the same way, how does VPlan
distinguish between the two?  How does it cost code generation that
uses only predication vs. code generation that uses a combination of
predication and VL?
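
For concreteness, here is how I read the predication-only skeleton,
modeled in scalar C rather than IR; the VL constant and the inner
"lane" loop are purely illustrative, standing in for one masked vector
operation:

  /* Strip-mined, predication-only loop: the outer loop steps by the
     full vector length, and the (i + lane < n) test plays the role of
     the predicate derived from the induction variable. */
  enum { VL = 8 };                 /* stands in for the hardware width */

  void saxpy_predicated(long n, float a, const float *x, float *y) {
    for (long i = 0; i < n; i += VL)
      for (long lane = 0; lane < VL; ++lane)
        if (i + lane < n)          /* lane is active */
          y[i + lane] = a * x[i + lane] + y[i + lane];
  }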

Assuming it can do that, do you envision vector codegen emitting
different IR for VL+predication (say, using intrinsics to set VL) vs. a
strictly predication-only plan?  If not, how does the LLVM backend know
to emit code to manipulate VL in the former case?
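
The VL+predication flavor I have in mind would instead look something
like the following, where set_active_vl() is a purely hypothetical
stand-in for whatever target intrinsic ends up clamping the active
vector length to the remaining trip count:

  enum { MAX_VL = 8 };             /* hardware maximum, for the model */

  static long set_active_vl(long remaining) { /* hypothetical intrinsic */
    return remaining < MAX_VL ? remaining : MAX_VL;
  }

  void saxpy_vl(long n, float a, const float *x, float *y) {
    for (long i = 0; i < n; ) {
      long vl = set_active_vl(n - i);  /* lanes active this iteration */
      for (long lane = 0; lane < vl; ++lane)
        y[i + lane] = a * x[i + lane] + y[i + lane];
      i += vl;                         /* tail needs no extra predicate */
    }
  }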

I don't need answers to these questions right now, as VL is a separate
issue and I don't want this thread to get bogged down in it.  But these
are questions that will come up if/when we tackle VL.

> At some point in the future I will propose something in this space to
> support RISC-V vectors, but we'll cross that bridge when we come to
> it.

Sounds good.

> Yes, for RISC-V we definitely need vscale to vary a bit, but are fine
> with limiting that to function boundaries. The use case is *not*
> "changing how large vectors are" in the middle of a loop or something
> like that, which we all agree is very dubious at best. The RISC-V
> vector unit is just very configurable (number of registers, vector
> element sizes, etc.) and this configuration can impact how large the
> vector registers are. For any given vectorized loop nest we want to
> configure the vector unit to suit that piece of code and run the loop
> with whatever register size that configuration yields. And when that
> loop is done, we stop using the vector unit entirely and disable it,
> so that the next loop can use it differently, possibly with a
> different register size. For IR modeling purposes, I propose to
> enlarge "loop nest" to "function" but the same principle applies, it
> just means all vectorized loops in the function will have to share a
> configuration.
>
> Without getting too far into the details, does this make sense as a
> use case?

I think so.  If changing vscale has some important advantage (saving
power?), I wonder how the compiler will deal with very large functions.
I have seen some truly massive Fortran subroutines with hundreds of loop
nests in them, possibly with very different iteration counts for each
one.
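
As a toy illustration of the concern (not code from any of those
subroutines): two loop nests in one function that would each prefer a
different vector-unit configuration, say because of their element
widths, would nonetheless have to share one under a per-function rule.

  void two_kernels(long n, double *a, const double *b,
                   long m, short *c, const short *d) {
    for (long i = 0; i < n; ++i)   /* 64-bit elements */
      a[i] += b[i];
    for (long j = 0; j < m; ++j)   /* 16-bit elements */
      c[j] = (short)(c[j] + d[j]);
  }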

                           -David

