[PATCH] D129844: [InstCombine] Relax "(X & 0xFF00) + xx00 --> (X + xx00) & 0xFF00"

Fri Jul 15 03:58:50 PDT 2022

piotr added a subscriber: foad.
piotr added a comment.

The existing pattern can cause problems for backends, because it potentially obstructs the (base_addr+offset) pattern, which is not always possible to fix at the selection level (if the pattern exists across basic blocks for instance).

While the additional m_OneUse check fixes an important case for AMDGPU backend and should be fairly uncontentious, given the existing check, I think it is worth considering whether the transformation is useful at all.

I did a bit of an archaeology, and before https://reviews.llvm.org/rG080e6bc2050e28ae198d82f0e934ca7b4548c3b7 there was a comment suggesting that the pattern was a part of another fold. Apart from the two tests that got changed in that commit, only one more pattern fails if I remove the transformation entirely - perhaps that needs to be fixed in some other way (?).

I did not attempt to remove it, as I do not have the full picture of the impact of such a change, but I can prepare a patch and help in testing (e.g. on AMDGPU workloads) if this would be a desirable change to make.

Lastly, here's an example (courtesy of @foad) that shows worse code in another backend (X86) if run with the instcombine:

  define i8 @f(i8 *%ptr, i64 %idx) {
    %idx1 = and i64 %idx, -4
    %idx2 = add i64 %idx1, 16
    %gep = getelementptr i8, i8* %ptr, i64 %idx2
    %val = load i8, i8* %gep
    ret i8 %val
  }

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129844/new/

https://reviews.llvm.org/D129844