[PATCH] D153490: [RISCV] Properly handle partial writes in isConvertibleToVMV_V_V.

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jun 25 14:19:12 PDT 2023


craig.topper added a comment.

In D153490#4447068 <https://reviews.llvm.org/D153490#4447068>, @fakepaper56 wrote:

> In D153490#4442663 <https://reviews.llvm.org/D153490#4442663>, @craig.topper wrote:
>
>> @reames FYI, I think this entire transformation may be conflating tail agnostic and tail undefined.
>
> LGTM. But I am curious what does the "tail undefined" mean?

Philip tried to document that here https://reviews.llvm.org/D152937

The vector spec defines tail agnostic as being all ones or undisturbed (and an extra case for instructions that write masks).

If the user did something like

  //vsetvli vlmax
  vmv.v.i v8, v8, -1
  //vsetvli less than vlmax with tail agnostic.
  vadd.vi v8, v16, 1

They could expect the tail elements of vadd.vi to be all ones regardless of how hardware treats tail agnostic. The tail was preloaded with all ones. Tail agnostic for the vadd must either keep them undisturbed or replace them will all ones. I don't know if anyone would really do this. The spec discourages using the tail, but it defined how it behaves so a user could still depend on it.

Tail undefined is the concept that the upper bits can really be any value.

This transform uses the VL of a producing instruction to limit the VL of a copy. This means we don't copy the tail elements of the producing operation. So they're undefined. All we checked was the policy was "tail agnostic". We don't know if the user expected all ones.

For autovectorized code we never create anything where we expect the all ones value.

The C RVV intrinsic interface is way less clear on this. Any intrinsic that doesn't have a passthru operand, is "tail undefined" because the user has no control of register allocation. For intrinsics that "tail agnostic" but have an operand that is tied to the dest like maskedoff for masked instruction or the one of the sources for fma instructions, the user does have control of what register the tail comes from so could depend on the behavior.

I think we should document in the intrinsics API that agnostic really means undefined.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153490/new/

https://reviews.llvm.org/D153490



More information about the llvm-commits mailing list