[PATCH] D140649: [AArch64][SelectionDAG] Eliminates redundant zero-extension for 32-bit popcount

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 5 20:01:45 PST 2023


efriedma added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/arm64-popcnt.ll:43
 ; CHECK-NEXT:    // kill: def $d0 killed $d0 def $q0
-; CHECK-NEXT:    fmov w8, s0
-; CHECK-NEXT:    fmov d0, x8
 ; CHECK-NEXT:    cnt.8b v0, v0
 ; CHECK-NEXT:    uaddlv.8b h0, v0
----------------
Allen wrote:
> efriedma wrote:
> > This doesn't appear to be equivalent.
> Thanks. With this change, I find that **%4:fpr32 = COPY %1.ssub:fpr128** is eliminated by the SIMPLE REGISTER COALESCING pass, but I am not sure whether that elimination is correct.
> 
> ```
> # *** IR Dump After Live Interval Analysis (liveintervals) ***:
> # Machine code for function cnt32_advsimd_1: NoPHIs, TracksLiveness
> Function Live Ins: $d0 in %0
> 
> 0B	bb.0 (%ir-block.0):
> 	  liveins: $d0
> 16B	  %0:fpr64 = COPY $d0
> 32B	  undef %1.dsub:fpr128 = COPY %0:fpr64
> 48B	  %4:fpr32 = COPY %1.ssub:fpr128
> 64B	  %5:fpr64 = SUBREG_TO_REG 0, %4:fpr32, %subreg.ssub
> 80B	  %6:fpr64 = CNTv8i8 %5:fpr64
> 96B	  %7:fpr16 = UADDLVv8i8v %6:fpr64
> 112B	  undef %8.hsub:fpr128 = COPY %7:fpr16
> 128B	  %10:gpr32all = COPY %8.ssub:fpr128
> 144B	  $w0 = COPY %10:gpr32all
> 160B	  RET_ReallyLR implicit killed $w0
> ```
> 
> * After **SIMPLE REGISTER COALESCING**:
> ```
> Function Live Ins: $d0 in %0
> 
> 0B	bb.0 (%ir-block.0):
> 	  liveins: $d0
> 16B	  undef %1.dsub:fpr128 = COPY $d0
> 80B	  %6:fpr64 = CNTv8i8 %1.dsub:fpr128
> 96B	  undef %8.hsub:fpr128 = UADDLVv8i8v %6:fpr64
> 128B	  %10:gpr32all = COPY %8.ssub:fpr128
> 144B	  $w0 = COPY %10:gpr32all
> 160B	  RET_ReallyLR implicit killed $w0
> ```
See D127154 for a similar situation.
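
One way to see why the two sequences may not be equivalent is a small model of the instructions involved. The Python sketch below is illustrative only: the helper names (`cnt_8b`, `uaddlv_8b`, `popcount32_with_zext`, `popcount32_without_zext`) are made up for this example, and it assumes the upper 32 bits of $d0 can be nonzero on entry, which is the case the removed `fmov w8, s0` / `fmov d0, x8` pair appears to guard against.

```python
MASK32 = 0xFFFFFFFF


def cnt_8b(d):
    """Model cnt.8b: per-byte population count over a 64-bit register value."""
    return [bin((d >> (8 * i)) & 0xFF).count("1") for i in range(8)]


def uaddlv_8b(lanes):
    """Model uaddlv.8b: widening sum across the eight byte lanes."""
    return sum(lanes)


def popcount32_with_zext(d):
    """Old sequence: zero-extend the low 32 bits before counting."""
    low = d & MASK32  # effect of fmov w8, s0 followed by fmov d0, x8
    return uaddlv_8b(cnt_8b(low))


def popcount32_without_zext(d):
    """New sequence: count over all 64 bits of the register as-is."""
    return uaddlv_8b(cnt_8b(d))


# Hypothetical register contents: low 32 bits = 0xF, upper 32 bits all ones.
d0 = 0xFFFFFFFF0000000F
print(popcount32_with_zext(d0))     # 4
print(popcount32_without_zext(d0))  # 36
```

Under this model the two sequences agree only when the upper half of the register is already zero, so the shorter sequence is safe only if something else guarantees that.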


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140649/new/

https://reviews.llvm.org/D140649


