[llvm-branch-commits] [llvm] [AMDGPU] Precommit si-fold-bitmask.mir (PR #131310)

Fri Mar 14 04:58:27 PDT 2025

Pierre-vh wrote:

> > GlobalISel unfortunately needs it. We can end up with things like a `G_LSHR` with the shift amount being zext'd, and they're both lowered independently so we have a `s_and_b32` of the shift amount.
> 
> It should always be post legalize / post regbankselect combinable. Things are strictly more difficult after selection

The main issue I was having was with code that had sub-32-bit arguments in registers.
We'd have:
```
%0(s32) = COPY $sgpr0
%1(s16) = G_TRUNC %0
%2(s32) = G_ZEXT %1
```
`%2` is then used as the shift amount. We can't eliminate the zext/trunc because the generic opcode doesn't say that only the lower bits are read, AFAIK. I experimented with multiple approaches but didn't find anything better than doing it in SIFoldOperands.
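
To make the redundancy concrete, here's a rough Python model (not LLVM code) of the situation. It assumes a 32-bit shift that only reads bits [4:0] of the shift amount, which is how the 32-bit scalar shifts behave on AMDGPU; all function names below are made up for the illustration.

```python
MASK32 = 0xFFFFFFFF

def hw_lshr_b32(value, amount):
    # Assumed model of the selected instruction: only amount[4:0] is read.
    return (value & MASK32) >> (amount & 31)

def zext_of_trunc16(x):
    # G_TRUNC to s16 followed by G_ZEXT to s32 is the same as `x & 0xffff`.
    return x & 0xFFFF

def mask_is_redundant(mask, bits_read=5):
    # An AND on the shift amount can be dropped if it keeps every bit
    # the shift instruction actually reads.
    low = (1 << bits_read) - 1
    return (mask & low) == low

# With or without the zext(trunc), the selected shift gives the same result.
x, amt = 0x80000000, 0x10004
print(hw_lshr_b32(x, zext_of_trunc16(amt)) == hw_lshr_b32(x, amt))
print(mask_is_redundant(0xFFFF), mask_is_redundant(0xF))
```

The generic `G_LSHR` makes no such promise about which bits of the amount are read, so this equivalence only holds once the target instruction is known.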

https://github.com/llvm/llvm-project/pull/131310