[PATCH] D120899: [RISCV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32

Thu Mar 3 17:15:39 PST 2022

craig.topper added a comment.

This isn't the only problem with this code. It's also broken for CPUs that take advantage of this sentence from the spec: "this permits an implementation to set vl = ceil(AVL / 2) for VLMAX < AVL < 2*VLMAX".

As a concrete example:
Suppose SEW=64, VLMAX is 8, and AVL is 9. The implementation is allowed to return 5 for the VL.
If we multiply the 9 by 2, we create an AVL of 18 for SEW=32, where VLMAX is 16. The implementation would be allowed to return 9 for the VL. We need it to return 10 (2x5) to match the SEW=64 VL.
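
To make the arithmetic concrete, here is a small Python model of the VL rule. It is only a sketch: permitted_vl is a hypothetical helper, not anything in LLVM, and it models just one legal choice (the ceil(AVL / 2) lower bound) for the VLMAX < AVL < 2*VLMAX range. It also assumes the SEW=32 VLMAX is twice the SEW=64 VLMAX, i.e. the same LMUL.

```python
def permitted_vl(avl: int, vlmax: int) -> int:
    """Hypothetical model of one VL an implementation may return."""
    if avl <= vlmax:
        return avl
    if avl < 2 * vlmax:
        # Any value in [ceil(AVL/2), VLMAX] is legal here; model the
        # lower bound, which is the case the quoted sentence permits.
        return -(-avl // 2)
    return vlmax


vl_sew64 = permitted_vl(9, 8)        # SEW=64: VLMAX=8, AVL=9
vl_sew32 = permitted_vl(9 * 2, 16)   # SEW=32: VLMAX=16, AVL=18

print(vl_sew64, vl_sew32, 2 * vl_sew64)
```

With this model the doubled AVL yields 9 rather than the 10 needed to cover five 64-bit elements.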

The only way I see out of this is to insert a SEW=64 vsetvli to explicitly create a VL less than or equal to the SEW=64 VLMAX, then multiply that by 2. This will fix both the overflow you identified and the case I just described.
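
A rough Python sketch of why this ordering helps, under the same assumptions as the example above: model_vsetvli, naive_vl32 and fixed_vl32 are hypothetical names, the model picks the ceil(AVL / 2) option, AVL is treated as a 32-bit register value on RV32, and the SEW=32 VLMAX is taken to be twice the SEW=64 VLMAX. This is not the actual lowering code, only the arithmetic.

```python
MASK32 = 0xFFFFFFFF  # XLEN=32 register width on RV32


def model_vsetvli(avl: int, vlmax: int) -> int:
    """Hypothetical model of one VL an implementation may return."""
    if avl <= vlmax:
        return avl
    if avl < 2 * vlmax:
        return -(-avl // 2)  # ceil(AVL / 2), the permitted lower bound
    return vlmax


def naive_vl32(avl: int, vlmax64: int) -> int:
    # Double the AVL first: can wrap in 32 bits, and the SEW=32
    # vsetvli may then pick a VL that is not 2x the SEW=64 VL.
    doubled = (avl * 2) & MASK32
    return model_vsetvli(doubled, 2 * vlmax64)


def fixed_vl32(avl: int, vlmax64: int) -> int:
    # Clamp with a SEW=64 vsetvli first, then double. vl64 <= VLMAX64,
    # so 2*vl64 <= VLMAX32 and the second vsetvli returns it unchanged.
    vl64 = model_vsetvli(avl, vlmax64)
    return model_vsetvli(vl64 * 2, 2 * vlmax64)


print(naive_vl32(9, 8), fixed_vl32(9, 8))
print(naive_vl32(0x80000000, 8), fixed_vl32(0x80000000, 8))
```

In this model the clamped value never exceeds the SEW=64 VLMAX, so the doubling cannot overflow and the SEW=32 VL is always exactly twice the SEW=64 VL.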

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120899/new/

https://reviews.llvm.org/D120899