[PATCH] D120899: [RISCV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32

Fri Mar 4 00:06:57 PST 2022

lhtin added a comment.

In D120899#3358636 <https://reviews.llvm.org/D120899#3358636>, @craig.topper wrote:

> This isn't the only problem with this code. It's broken for CPUs that implement this sentence from the spec. "this permits an implementation to set vl = ceil(AVL / 2) for VLMAX < AVL < 2*VLMAX".
>
> As a concrete example:
> If SEW=64, VLMAX is 8, and AVL is 9, the implementation is allowed to return 5 for the VL.
> If we multiply the 9 by 2, we create an AVL of 18 for SEW=32. The implementation would be allowed to return 9 for the VL. We need it to return 10, i.e. 2x5, to match the SEW=64 VL.
>
> The only way I see out of this is to insert a SEW=64 vsetvli to explicitly create a VL less than or equal to the SEW=64 VLMAX, then multiply that by 2. This will fix both the overflow you identified and the case I just described.

Thanks for your review. You are right. I will try to solve both problems based on your comments.
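
To check my understanding of both cases, here is a rough Python model of the arithmetic. It is only a sketch: `vsetvli_vl` is a hypothetical helper, not anything in LLVM, and it assumes an implementation that always picks ceil(AVL / 2) in the VLMAX < AVL < 2*VLMAX window (the spec merely permits this). It also assumes the same LMUL for both settings, so the SEW=32 VLMAX is twice the SEW=64 VLMAX, and a 32-bit XLEN for the overflow case.

```python
XLEN_MASK = 0xFFFFFFFF  # RV32: AVL arithmetic wraps at 32 bits (assumption for this model)


def vsetvli_vl(avl, vlmax):
    """Hypothetical model of the VL one permitted implementation returns."""
    if avl <= vlmax:
        return avl
    if avl < 2 * vlmax:
        # Permitted by the spec sentence quoted above: vl = ceil(AVL / 2).
        return (avl + 1) // 2
    return vlmax


def sew32_vl_by_doubling_avl(avl, vlmax32):
    """Doubling the AVL first, then requesting SEW=32 (the problematic order)."""
    doubled = (2 * avl) & XLEN_MASK  # may wrap on RV32
    return vsetvli_vl(doubled, vlmax32)


def sew32_vl_by_doubling_vl(avl, vlmax64, vlmax32):
    """SEW=64 vsetvli first, then double the resulting VL (the suggested order)."""
    vl64 = vsetvli_vl(avl, vlmax64)  # always <= vlmax64, so 2 * vl64 cannot wrap
    return vsetvli_vl(2 * vl64, vlmax32)


VLMAX64, VLMAX32 = 8, 16

# Case from the review: AVL = 9.
vl64 = vsetvli_vl(9, VLMAX64)                                 # SEW=64 VL
vl32_broken = sew32_vl_by_doubling_avl(9, VLMAX32)            # AVL 18 at SEW=32
vl32_fixed = sew32_vl_by_doubling_vl(9, VLMAX64, VLMAX32)     # 2 * vl64

# Overflow case: a large AVL whose doubled value wraps to a small number.
big_avl = 0x80000001
vl32_overflow = sew32_vl_by_doubling_avl(big_avl, VLMAX32)
vl32_overflow_fixed = sew32_vl_by_doubling_vl(big_avl, VLMAX64, VLMAX32)
```

In this model, doubling the AVL gives a SEW=32 VL that is not 2x the SEW=64 VL in either case, while doubling the VL returned by the SEW=64 vsetvli does.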

https://reviews.llvm.org/D120899