[PATCH] D57504: RFC: Prototype & Roadmap for vector predication in LLVM

Sun Feb 2 10:25:29 PST 2020

programmerjake added a comment.

In D57504#1853591 <https://reviews.llvm.org/D57504#1853591>, @rkruppe wrote:

> (This was gonna be an inline comment on D69891 <https://reviews.llvm.org/D69891>, but it's more of a general conceptual issue, so I decided to move it here.)
>
> Right now, LangRef changes in D69891 <https://reviews.llvm.org/D69891> describe the restriction on the EVL value as this:
>
> > The explicit vector length (%evl) is only effective if it is non-negative, and when that is the case, its value is in the range:
> > 
> >   0 <= %evl <= W,   where W is the vector length.
>
> The restriction is good, but this wording doesn't specify what happens when `%evl` is not in that range. Some sort of undefined behavior, I assume, but this must be explicitly stated, especially since there are many ways in which it could be undefined. I don't recall previous discussion of this detail and I don't know what you have in mind, but some possibilities I see:
>
> 1. The instruction has capital-UB undefined behavior. This gives the greatest flexibility to backends (e.g., allows generation of code that traps if %evl is too large) but I don't know of any architecture that needs this much flexibility and it constrains IR optimizations (code hoisting etc.) the most.
> 2. The instruction returns poison (i.e., all result lanes are poison) and all lanes are (potentially, non-deterministically) enabled regardless of the mask parameter. This is less restrictive for IR optimizations (e.g., integer `vp.add` can unconditionally be speculated) but still allows backends to unconditionally use SETVL-style "stripmining" instructions that are not generally consistent (across architectures) w.r.t. which lanes become active when a vector length greater than the hardware vector length is requested.
> 3. `%EVLmask` is undef, that's all. As a consequence, lanes disabled by the `%mask` argument definitely stay disabled, but for other lanes (where the mask has a 1 or an undef) it's non-deterministic whether they are active. As far as I can see, this has pretty much the same implications for IR optimizations and backends (excluding hypothetical pathological architectures) but is less of a special case to specify and directly captures the diversity of hardware behavior that (presumably) motivates this restriction on EVL.
>
>   Off the cuff, I would suggest the last option.
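
For concreteness, option 3 could be modeled roughly as below. This is only an illustrative Python sketch, not LangRef wording: `evl_mask` and `active_lanes` are hypothetical names, undef is approximated by an arbitrary per-lane choice, and treating a negative `%evl` as "EVL ignored" is one reading of the quoted "only effective if it is non-negative".

```python
import random


def evl_mask(evl, width, rng=random):
    """Per-lane mask implied by %evl for a vector of `width` lanes (hypothetical model)."""
    if evl < 0:
        # Assumed reading: a negative %evl is not effective, so it disables nothing.
        return [True] * width
    if evl <= width:
        # In range: the first `evl` lanes are enabled, the rest are disabled.
        return [lane < evl for lane in range(width)]
    # Out of range (%evl > W): option 3 says the EVL mask is undef.
    # Undef is approximated here by an arbitrary choice per lane.
    return [rng.random() < 0.5 for _ in range(width)]


def active_lanes(mask, evl, rng=random):
    """A lane is active only if both %mask and the EVL mask enable it."""
    return [m and e for m, e in zip(mask, evl_mask(evl, len(mask), rng))]
```

With `mask = [True, False, True, True]`, an in-range `evl = 2` leaves only lane 0 active. With `evl = 9`, lane 1 is still guaranteed inactive, while lanes 0, 2 and 3 may or may not be active.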

We (Libre-SoC, provisionally renamed from Libre-RISCV) are currently building a processor that supports variable-length vector operations. Each operation specifies its starting register in a flat register file, and VL tells it how many elements to operate on; dividing VL by the number of elements per register directly gives the number of registers to operate on. So, if VL is out of bounds, an instruction can overwrite registers past the end of the range assigned by the register allocator and/or trap. This would probably force use of option #1 above, at least for our processor. Our ISA design is still incomplete, so we might add (or might already have) a mechanism that allows use of option #2 or #3 if there is sufficient reason (we will have to see what the rest of Libre-SoC thinks).
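
To illustrate the overrun, here is a rough Python sketch of the register arithmetic. The function names, the register numbers, and the rounding up to whole registers are assumptions made for illustration; none of this is taken from the actual ISA.

```python
def registers_touched(start_reg, vl, elems_per_reg):
    """Registers an operation would touch, given VL (hypothetical model)."""
    # Assumed: a partially filled last register still counts, so round up.
    count = -(-vl // elems_per_reg)
    return range(start_reg, start_reg + count)


def overruns_allocation(start_reg, vl, elems_per_reg, allocated_regs):
    """True if the operation reaches past the registers the allocator assigned."""
    return len(registers_touched(start_reg, vl, elems_per_reg)) > allocated_regs
```

For example, with 4 elements per register and 2 registers allocated starting at register 8 (so W = 8), `VL = 8` touches registers 8 and 9 only, while `VL = 12` also touches register 10, which the allocator may have assigned to something else.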

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57504/new/

https://reviews.llvm.org/D57504