[PATCH] D117680: [InstCombine] Fold bswap(shl(x, C)) -> and(x, 255)
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 19 11:14:07 PST 2022
spatel added a comment.
In D117680#3255612 <https://reviews.llvm.org/D117680#3255612>, @craig.topper wrote:
> In D117680#3255573 <https://reviews.llvm.org/D117680#3255573>, @spatel wrote:
>
>> Pushing a logical shift after the bswap (and reversing direction) might get us most of what we need:
>> https://alive2.llvm.org/ce/z/2zveR6
>> That should allow the existing demanded bits fold to trigger in the simplest cases if I'm seeing it correctly.
>
> That wouldn't help with the (bswap (and)) though would it? But my computeKnownBits suggestion would allow the bswap to be reduced to a shift.
Right - knownbits would give us more flexibility on the single-byte cases, but it wouldn't do anything for the patterns that deal with >1 byte? There's some overlap, but these could be independent patches.
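For reference, a small Python sketch (not part of the patch) of the two bit-identities under discussion: the patch's fold bswap(shl(x, 24)) -> and(x, 255) on i32, and the shift-reversal rewrite from the alive2 link, which holds when the shift amount is a whole number of bytes:

```python
def bswap32(x):
    """Reverse the four bytes of a 32-bit value (models llvm.bswap.i32)."""
    x &= 0xFFFFFFFF
    return int.from_bytes(x.to_bytes(4, "big"), "little")

# Patch's fold: bswap(shl(x, 24)) -> and(x, 255) on i32.
# Shifting left by 24 leaves only the low byte (now in the top position);
# bswap moves it back down to the low byte.
for x in (0, 1, 0xAB, 0x12345678, 0xFFFFFFFF):
    assert bswap32((x << 24) & 0xFFFFFFFF) == x & 0xFF

# Shift-reversal rewrite: for a byte-multiple shift amount C,
# bswap(shl(x, C)) == lshr(bswap(x), C).
for x in (0, 0x12345678, 0xDEADBEEF):
    for c in (8, 16, 24):
        assert bswap32((x << c) & 0xFFFFFFFF) == bswap32(x) >> c
```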
I was hoping that pushing the shift to the end could trigger a narrowing transform on the bswap, but we miss that too:
define i32 @_Z4loadILj3EEjPKh(i8* noundef %0) {
%2 = bitcast i8* %0 to i24*
%3 = load i24, i24* %2, align 1
%4 = zext i24 %3 to i32
%5 = call i32 @llvm.bswap.i32(i32 %4) ; can we do anything with this?
%6 = lshr exact i32 %5, 8
ret i32 %6
}
define i32 @_Z4loadILj2EEjPKh(i8* noundef %0) {
%2 = bitcast i8* %0 to i16*
%3 = load i16, i16* %2, align 1
%4 = zext i16 %3 to i32
%5 = call i32 @llvm.bswap.i32(i32 %4) ; bswap.i16
%6 = lshr exact i32 %5, 16
ret i32 %6
}
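A quick Python check (an illustration, not the implementation) of the narrowing transform missed in the second function: the wide bswap followed by the lshr is equivalent to a narrow bswap, i.e. lshr(bswap.i32(zext i16 x), 16) == zext(bswap.i16(x)). Note this only works for the 2-byte case; llvm.bswap requires an even number of bytes, so there is no bswap.i24 for the first function.

```python
def bswap(x, bits):
    """Byte-reverse x interpreted as a `bits`-wide value (models llvm.bswap)."""
    return int.from_bytes(x.to_bytes(bits // 8, "big"), "little")

# Narrowing: lshr(bswap.i32(zext i16 x), 16) == zext(bswap.i16(x)).
for x in (0, 1, 0xAB, 0x1234, 0xFFFF):
    assert bswap(x, 32) >> 16 == bswap(x, 16)
```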
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117680/new/
https://reviews.llvm.org/D117680