[PATCH] D57504: RFC: Prototype & Roadmap for vector predication in LLVM

Sun Feb 2 08:08:19 PST 2020

rkruppe added a comment.

(This was gonna be an inline comment on D69891 <https://reviews.llvm.org/D69891>, but it's more of a general conceptual issue, so I decided to move it here.)

Right now, LangRef changes in D69891 <https://reviews.llvm.org/D69891> describe the restriction on the EVL value as this:

> The explicit vector length (%evl) is only effective if it is non-negative, and when that is the case, its value is in the range:
> 
>   0 <= %evl <= W,   where W is the vector length.

The restriction is good, but this wording doesn't specify what happens when `%evl` is not in that range. Some sort of undefined behavior, I assume, but this must be explicitly stated, especially since there are many ways in which it could be undefined. I don't recall previous discussion of this detail and I don't know what you have in mind, but some possibilities I see:

1. The instruction has capital-UB undefined behavior. This gives the greatest flexibility to backends (e.g., allows generation of code that traps if `%evl` is too large), but I don't know of any architecture that needs this much flexibility, and it constrains IR optimizations (code hoisting etc.) the most.
2. The instruction returns poison (i.e., all result lanes are poison) and all lanes are (potentially, non-deterministically) enabled regardless of the mask parameter. This is less restrictive for IR optimizations (e.g., integer `vp.add` can unconditionally be speculated) but still allows backends to unconditionally use SETVL-style "stripmining" instructions that are not generally consistent (across architectures) w.r.t. which lanes become active when a vector length greater than the hardware vector length is requested.
3. `%EVLmask` is undef, that's all. As a consequence, lanes disabled by the `%mask` argument definitely stay disabled, but for other lanes (where the mask has a 1 or an undef) it's non-deterministic whether they are active. As far as I can see, this has pretty much the same implications for IR optimizations and backends as option 2 (excluding hypothetical pathological architectures), but it is less of a special case to specify and directly captures the diversity of hardware behavior that (presumably) motivates this restriction on EVL.

Off the cuff, I would suggest the last option.
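
To make option 3 a bit more concrete, here is a rough Python model of how the effective lane mask could be computed under it. This is only a sketch, not proposed LangRef wording: the helper names (`evl_mask`, `and_bit`, `effective_mask`) are made up for illustration, `None` stands in for an undef `i1`, and I'm assuming every `%evl` value outside `0 <= %evl <= W` is treated the same way, which the quoted wording may not intend for negative values.

```python
from typing import List, Optional

UNDEF = None  # stand-in for an undef i1 (assumption of this model)

Bit = Optional[bool]


def evl_mask(evl: int, width: int) -> List[Bit]:
    """Model of %EVLmask: lane i is on iff i < evl when evl is in range.

    Under option 3, an out-of-range evl makes every bit undef.
    """
    if 0 <= evl <= width:
        return [i < evl for i in range(width)]
    return [UNDEF] * width


def and_bit(a: Bit, b: Bit) -> Bit:
    """AND on i1 with undef: a definite 0 wins, otherwise undef propagates."""
    if a is False or b is False:
        return False
    if a is UNDEF or b is UNDEF:
        return UNDEF
    return True


def effective_mask(mask: List[Bit], evl: int) -> List[Bit]:
    """Lanes that are actually enabled: %mask AND %EVLmask, lane by lane."""
    return [and_bit(m, e) for m, e in zip(mask, evl_mask(evl, len(mask)))]


mask = [True, False, True, True]
print(effective_mask(mask, 2))  # [True, False, False, False]
print(effective_mask(mask, 9))  # [None, False, None, None]
```

With an in-range `%evl` the result is fully determined. With `%evl = 9` on a 4-lane vector, the lane that `%mask` disables stays disabled, and the remaining lanes come out as undef, i.e. it's non-deterministic whether they are active.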

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D57504