[PATCH] D155472: [DAG] Attempt shl narrowing in SimplifyDemandedBits (WIP)

Mon Jul 24 02:08:26 PDT 2023

RKSimon added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll:4710
-; GFX6-NEXT:    v_or_b32_e32 v2, v3, v2
-; GFX6-NEXT:    v_or_b32_e32 v0, v2, v0
-; GFX6-NEXT:    buffer_store_dword v0, off, s[0:3], 0
----------------
arsenm wrote:
> I haven't managed to spot where the 64-bit shift that got removed is, but getting rid of them is really good
Yes, a lot of the AMDGPU improvements in this patch and D146121 appear to be from better handling of i64 arithmetic.
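
For readers following along, here is a minimal sketch of the arithmetic identity that makes narrowing a 64-bit shl legal when only the low 32 bits of the result are demanded. It is a toy model on plain integers, not the code in this patch; the helper names (`shl64`, `shl32`, `low32_of_wide_shift`, `narrowed_shift`) are hypothetical, and the exact conditions the patch checks are not shown in this thread.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def shl64(x: int, c: int) -> int:
    """Model of a 64-bit shl: the result wraps to 64 bits."""
    return (x << c) & MASK64


def shl32(x: int, c: int) -> int:
    """Model of a 32-bit shl: the result wraps to 32 bits."""
    return (x << c) & MASK32


def low32_of_wide_shift(x: int, c: int) -> int:
    """Do the shift at 64 bits, then keep only the demanded low 32 bits."""
    return shl64(x, c) & MASK32


def narrowed_shift(x: int, c: int) -> int:
    """Truncate the operand first, then shift at 32 bits.

    Assumption: 0 <= c < 32, so the shift amount is valid at the narrow width.
    """
    return shl32(x & MASK32, c)
```

Under that assumption the two forms agree for every 64-bit input, which is why a target with 32-bit registers can drop the upper half of the shift when nothing reads it.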

================
Comment at: llvm/test/CodeGen/X86/atomic-rm-bit-test-64.ll:1226
 ; CHECK-NEXT:  # %bb.2: # %if.then
+; CHECK-NEXT:    movl %esi, %esi
 ; CHECK-NEXT:    movq (%rdi,%rsi,8), %rax
----------------
goldstein.w.n wrote:
> hmm?
We end up with zero_extend(truncate(assertzext(x))) in X86DAGToDAGISel, which is too late for any combine to fold it all away. We'll need a peephole (or a workaround in getNode()).
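
A rough sketch of the kind of fold such a peephole would perform, written against a toy node type rather than SelectionDAG. The `Node` class, its fields, and `fold_zext_trunc` are all hypothetical names for illustration; the real fix would live in the X86 backend or in getNode() and may look quite different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Toy DAG node (hypothetical, not an LLVM type)."""
    op: str                       # "reg", "assertzext", "truncate", "zero_extend"
    bits: int                     # width of the node's result in bits
    src: Optional["Node"] = None  # single operand, if any
    known_bits: int = 0           # assertzext only: low bits that may be non-zero


def fold_zext_trunc(n: Node) -> Node:
    """Fold zero_extend(truncate(assertzext(x))) to assertzext(x) when safe.

    The fold is valid only if the zero_extend restores the original width and
    every bit that assertzext allows to be set survives the truncate.
    """
    if n.op != "zero_extend" or n.src is None or n.src.op != "truncate":
        return n
    trunc = n.src
    inner = trunc.src
    if inner is None or inner.op != "assertzext":
        return n
    if inner.bits == n.bits and inner.known_bits <= trunc.bits:
        return inner  # the extend/truncate pair is a no-op
    return n          # leave the node alone when the fold is not provably safe
```

In the test above, this corresponds to a 64-bit value already asserted to have its upper 32 bits clear, so the extra `movl %esi, %esi` would not be needed.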

================
Comment at: llvm/test/CodeGen/X86/lsr-loop-exit-cond.ll:76
+; GENERIC-NEXT:    andl $-4, %r14d
+; GENERIC-NEXT:    movzbl 3(%rdi,%r14), %edi
 ; GENERIC-NEXT:    shll $24, %edi
----------------
goldstein.w.n wrote:
> Another here. It seems some transform that does `shr; AGEN` is breaking down a bit.
Yes, the problem is that X86DAGToDAGISel::matchAddressRecursively isn't currently set up to see through zero extensions properly; we only handle a few special cases. Ideally the recursion would peek through zext nodes, and we'd hopefully get rid of promoteExtBeforeAdd entirely as well (sext is much less of a problem and easier to handle).
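
To illustrate what "peeking through zext nodes" could mean, here is a toy recursive address matcher. It is a sketch only: the `Reg`, `Const`, `Add`, and `ZExt` classes and `match_address` are hypothetical, and it assumes the add carries a no-unsigned-wrap flag, since zext(a + b) equals zext(a) + zext(b) only when the narrow add cannot wrap. It does not model the actual matchAddressRecursively logic.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Const:
    value: int  # assumed non-negative in this toy model


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"
    nuw: bool = False  # True if the add is known not to wrap (unsigned)


@dataclass(frozen=True)
class ZExt:
    src: "Expr"


Expr = Union[Reg, Const, Add, ZExt]


def match_address(e: Expr) -> Tuple[List[Expr], int]:
    """Split a toy address expression into (register terms, displacement)."""
    if isinstance(e, Const):
        return [], e.value
    if isinstance(e, Add):
        l_regs, l_disp = match_address(e.lhs)
        r_regs, r_disp = match_address(e.rhs)
        return l_regs + r_regs, l_disp + r_disp
    if isinstance(e, ZExt) and isinstance(e.src, Add) and e.src.nuw:
        # Peek through: zext(add nuw a, b) == add(zext a, zext b).
        return match_address(Add(ZExt(e.src.lhs), ZExt(e.src.rhs)))
    if isinstance(e, ZExt) and isinstance(e.src, Const):
        # Zero-extending a non-negative constant leaves its value unchanged.
        return [], e.src.value
    # Anything else is treated as an opaque register term.
    return [e], 0
```

With the flag set, a constant hidden under the zext is recovered as a displacement; without it, the whole zext stays an opaque term, which is roughly the situation behind the extra instructions in this test.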

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155472/new/

https://reviews.llvm.org/D155472