[PATCH] D134635: [AMDGPU][GlobalISel] Add Shift/Shufflevector Combine

Tue Sep 27 01:54:20 PDT 2022

foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:336-338
+  // Fold (G_LSHR (G_BITCAST (G_SHUFFLE_VECTOR %a, %b, shufflemask(1, ?))), K)
+  // into a simple G_BITCAST(%a) if the shift amount (K) is 1/2 of the
+  // destination type and the vector types have 2 elements.
----------------
This doesn't make sense to me:
- The original has 0s in its high bits but the replacement does not.
- AMDGPU is little-endian so `(G_BITCAST (G_SHUFFLE_VECTOR %a, %b, shufflemask(1, ?)))` puts the //high// bits of %a in the //low// bits of the result - and then the G_LSHR shifts them out completely.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:367
+
+  // We can just replace this with a bitcast of the LHS of the ShiftOp.
+  Reg = LHS;
----------------
"... of the SHUFFLE op"?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:375
+  B.setInstrAndDebugLoc(MI);
+  B.buildBitcast(MI.getOperand(0).getReg(), Reg);
+  MI.eraseFromParent();
----------------
LHS and RHS could have more than two elements, in which case I would expect this bitcast to fail machine verification. Can you add a test for that case?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134635/new/

https://reviews.llvm.org/D134635