[PATCH] D134635: [AMDGPU][GlobalISel] Add Shift/Shufflevector Combine

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 27 02:54:15 PDT 2022


Pierre-vh added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:336-338
+  // Fold (G_LSHR (G_BITCAST (G_SHUFFLE_VECTOR %a, %b, shufflemask(1, ?))), K)
+  // into a simple G_BITCAST(%a) if the shift amount (K) is 1/2 of the
+  // destination type and the vector types have 2 elements.
----------------
foad wrote:
> This doesn't make sense to me:
> - The original has 0s in its high bits but the replacement does not.
> - AMDGPU is little-endian so `(G_BITCAST (G_SHUFFLE_VECTOR %a, %b, shufflemask(1, ?)))` puts the //high// bits of %a in the //low// bits of the result - and then the G_LSHR shifts them out completely.
This is really a headache to understand for me, but IIRC endianness doesn't affect values themselves, but how they're represented in memory? (https://stackoverflow.com/questions/7184789/does-bit-shift-depend-on-endianness)

So I think that G_SHUFFLE_VECTOR (1,?) puts the low bits in the high bits of the results, and the LSHR puts them back in place.

```
%a:(<2 x s16>) = [0xDEAD, 0xBEEF]
%b:(<2 x s16>) = undef
%c = G_SHUFFLE_VECTOR %a, %b, shufflemask(1,0)
; %c = [0xBEEF, 0xDEAD] 
%d = G_LSHR %c, 16
; %d = [0x0000, 0xBEEF] 
```

So the lower 16 words stay the same, but indeed a difference before/after the combine is that the high words have 0000 if we leave the combine, but now they have "junk".

Maybe an additional trunc is needed in between or something?

What do you think @arsenm ?



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:375
+  B.setInstrAndDebugLoc(MI);
+  B.buildBitcast(MI.getOperand(0).getReg(), Reg);
+  MI.eraseFromParent();
----------------
foad wrote:
> LHS and RHS could have more than two elements, in which case I would expect this bitcast to fail machine verification. Can you add a test for that case?
Good catch, I'll enforce that the operands must be the same size as the dest.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134635/new/

https://reviews.llvm.org/D134635



More information about the llvm-commits mailing list