[PATCH] D104921: [RISCV] Lower more BUILD_VECTOR sequences to RVV's VID

Tue Jul 13 03:37:43 PDT 2021

frasercrmck marked 2 inline comments as done.
frasercrmck added inline comments.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1569
+  if (auto SimpleVID = isSimpleVIDSequence(Op)) {
+    // Only emit VIDs with suitably-small steps/addends. We use imm6 is a
+    // threshold since it's the immediate value many RVV instructions accept.
----------------
craig.topper wrote:
> Isn't it simm5?
At least I was close! Fixed, cheers.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1584
+        unsigned Opcode = ISD::MUL;
+        if (Step > 0 && isPowerOf2_64(Step)) {
+          Opcode = ISD::SHL;
----------------
craig.topper wrote:
> Step == -1 -> negate?
Yep, thanks! I now catch steps where the absolute value is a power-of-two and use a reverse SUB to account. By combining it with the `Addend` handling we see better codegen below. Doing `ISD::SUB` followed `ISD::ADD` isn't combined for reasons explained below.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll:179
+; RV32-NEXT:    addi a0, zero, -1
+; RV32-NEXT:    vmadd.vx v28, a0, v25
 ; RV32-NEXT:    addi a0, zero, 12
----------------
craig.topper wrote:
> This can I think be
> 
> ```
> vid.v v28
> vrsub.vx v28, 4
> ```
Yeah, thanks. I actually saw that codegen when at first I was doing scalable-vector post-VID adjustments (`ISD::ADD` etc.). This was wrong, however, as we dropped the fixed vector length info. When I corrected it to adjustments after converting from the scalable container type I saw worse code generation as the fixed-vector `ISD` nodes are immediately custom-lowered to `RISCVISD` counterparts without going through any DAGCombining.

This desired code generation is now "fixed" by doing "negate + addend" folding above while working on your earlier suggestion for negative steps.

I'm not a huge fan of having to do our own optimizations at this level. Perhaps if we did this `BUILD_VECTOR` optimization earlier in combining we'd save some bother, but we'd also have to make `SHUFFLE_VECTOR` happen earlier to allow its indices to go through the same process.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104921/new/

https://reviews.llvm.org/D104921