[PATCH] D123394: [CodeGen] Late cleanup of redundant address/immediate definitions.

Wed Apr 20 07:51:08 PDT 2022

jonpa updated this revision to Diff 423903.
jonpa added a comment.
Herald added subscribers: armkevincheng, eric-k256, dmgreen, atanasyan, jrtc27, kbarton, nemanjai, sdardis, qcolombet.
Herald added a reviewer: sjarus.

Some minor fixes, but mostly progress on updating of tests.

Most tests look improved, but I am unsure about these test updates:

- ARM/ifcvt-branch-weight-bug.ll.  I see one less if-conversion taking place:

  Ifcvt: function (0) 'test'
  MeetIfcvtSizeLimit(TCycle=1, FCycle=2, TExtra=0, FExtra=0) = 1
  Ifcvt (Diamond): %bb.3 (T:7,F:4) succeeded!
  Ifcvt (Triangle): %bb.1 (T:5,F:2) succeeded!
  =>
  Ifcvt: function (0) 'test'
  Ifcvt (Triangle): %bb.1 (T:5,F:2) succeeded!

The test says it is testing a bug in IfConverterTriangle. Since the Triangle case is still being optimized, I updated the test in the hope that it is still valid, but I am not sure.

- ARM/jump-table-islands.ll

There are far fewer mvn/mov instructions, but the CFG has changed, so I am not sure how to update the test.

I don't understand the changes in these tests, or they look strange:

- X86/masked_load.ll
- X86/oddshuffles.ll
- PowerPC/fp-strict-conv-f128.ll
- PowerPC/ppcf128-constrained-fp-intrinsics.ll

More tests remain to be updated on AMDGPU (about a dozen) and RISCV (about two dozen).

I have also looked into a few examples on SystemZ (the ones listed before). In one case, the multiple VGBM 0 instructions (load 0 into a vector register) started out as a single immediate load that was copied into other registers. The register coalescer later rematerialized the COPYs as VGBMs, since that instruction is marked isAsCheapAsAMove. In another case, the zero was used in two entirely different contexts and just happened to be loaded into the same register, which created a reuse opportunity. The LAs (frame address anchors) come from the elimination of FrameIndices during PEI.

Given this, and the fact that other targets also seem to benefit, this does not look like something lacking specifically in the SystemZ backend, but rather like a general cleanup opportunity. Of course, these constant loads may not matter much performance-wise in the first place, but it wouldn't hurt to improve on them.
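
To make the same-register case concrete, here is a minimal Python toy model of that kind of cleanup. It is only a sketch under stated assumptions and not the code in the patch: the `Instr` tuple, the `li` (load immediate) opcode and the function name are hypothetical, and it ignores liveness across blocks, sub-registers and implicit operands.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical toy instruction, not an LLVM MachineInstr."""
    opcode: str
    dst: Optional[str]
    srcs: Tuple[str, ...] = ()
    imm: Optional[int] = None


def remove_redundant_imm_loads(block):
    """Drop an immediate load if the register already holds that value.

    Works on a single basic block (a list of Instr) and returns a new list.
    """
    known = {}  # register -> immediate it is known to hold
    kept = []
    for ins in block:
        if ins.opcode == "li":
            if ins.dst in known and known[ins.dst] == ins.imm:
                continue  # same register, same value: redundant
            known[ins.dst] = ins.imm
        elif ins.dst is not None:
            # Any other definition clobbers what we knew about the register.
            known.pop(ins.dst, None)
        kept.append(ins)
    return kept


block = [
    Instr("li", "v0", imm=0),
    Instr("store", None, ("v0",)),
    Instr("li", "v0", imm=0),            # redundant reload of zero
    Instr("store", None, ("v0",)),
    Instr("add", "v0", ("v0", "v1")),    # clobbers v0
    Instr("li", "v0", imm=0),            # needed again after the clobber
]
cleaned = remove_redundant_imm_loads(block)
```

In this example only the second load is dropped; the last one stays because the `add` redefined `v0` in between.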

I also realized that there is room for further improvement: an identical immediate load could sometimes be reused even when a different register holds the value. Since the second immediate load (at least in some cases) started out as a COPY of the first one, this may even be seen as a kind of COPY propagation. I wonder if it would be worth trying to extend the Machine Copy Propagation pass to help with this. Or would it be better to keep that pass as simple as possible and instead have a separate pass for this, perhaps with some shared code?
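
The cross-register reuse idea can be sketched in the same toy style. Again this is only an illustration under assumptions, not a proposal for how Machine Copy Propagation should be extended: `Instr`, the `li` opcode, the `live_out` parameter and both function names are hypothetical. The sketch drops a second load of the same immediate into a different register and rewrites later uses to the first register, but only if the first register is not redefined while the second one would have been live.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical toy instruction, not an LLVM MachineInstr."""
    opcode: str
    dst: Optional[str]
    srcs: Tuple[str, ...] = ()
    imm: Optional[int] = None


def _next_redef(block, start, reg):
    """Index of the next definition of reg after start, or len(block)."""
    for j in range(start + 1, len(block)):
        if block[j].dst == reg:
            return j
    return len(block)


def reuse_imm_across_regs(block, live_out=frozenset()):
    """Reuse an earlier register that already holds the same immediate."""
    block = list(block)
    holder = {}  # immediate -> register currently known to hold it
    kept = []
    for i, ins in enumerate(block):
        if ins.opcode == "li" and ins.imm in holder:
            src = holder[ins.imm]
            if src == ins.dst:
                continue  # same register already holds the value
            end = _next_redef(block, i, ins.dst)
            escapes = end == len(block) and ins.dst in live_out
            src_stable = all(block[j].dst != src for j in range(i + 1, end))
            if not escapes and src_stable:
                # Rewrite every read of ins.dst up to (and including) the
                # instruction that redefines it, then drop the load.
                for j in range(i + 1, min(end + 1, len(block))):
                    new_srcs = tuple(src if s == ins.dst else s
                                     for s in block[j].srcs)
                    block[j] = block[j]._replace(srcs=new_srcs)
                continue
        if ins.dst is not None:
            # A kept definition invalidates anything recorded for its register.
            for imm in [k for k, r in holder.items() if r == ins.dst]:
                del holder[imm]
            if ins.opcode == "li":
                holder[ins.imm] = ins.dst
        kept.append(ins)
    return kept


block = [
    Instr("li", "v0", imm=0),
    Instr("store", None, ("v0",)),
    Instr("li", "v1", imm=0),            # e.g. a rematerialized COPY of v0
    Instr("store", None, ("v1",)),
    Instr("add", "v2", ("v1", "v3")),
]
reused = reuse_imm_across_regs(block)
```

The forward scan keeps the toy conservative: if `v0` were redefined before the last use of `v1`, or if `v1` were listed in `live_out`, the second load would be kept.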

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123394/new/

https://reviews.llvm.org/D123394

Files:
  llvm/lib/CodeGen/PrologEpilogInserter.cpp
  llvm/test/CodeGen/AArch64/framelayout-sve.mir
  llvm/test/CodeGen/AArch64/strict-fp-int-promote.ll
  llvm/test/CodeGen/AArch64/sve-calling-convention-mixed.ll
  llvm/test/CodeGen/AArch64/sve-ld1r.mir
  llvm/test/CodeGen/AArch64/sve-ldstnt1.mir
  llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll
  llvm/test/CodeGen/ARM/ifcvt-branch-weight-bug.ll
  llvm/test/CodeGen/ARM/machine-outliner-calls.mir
  llvm/test/CodeGen/ARM/reg_sequence.ll
  llvm/test/CodeGen/BPF/objdump_cond_op_2.ll
  llvm/test/CodeGen/Mips/llvm-ir/lshr.ll
  llvm/test/CodeGen/Mips/llvm-ir/shl.ll
  llvm/test/CodeGen/PowerPC/aix-csr-vector-extabi.ll
  llvm/test/CodeGen/PowerPC/cgp-select.ll
  llvm/test/CodeGen/PowerPC/fast-isel-branch.ll
  llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
  llvm/test/CodeGen/PowerPC/ppcf128-constrained-fp-intrinsics.ll
  llvm/test/CodeGen/SystemZ/frame-28.mir
  llvm/test/CodeGen/Thumb2/mve-fpclamptosat_vec.ll
  llvm/test/CodeGen/Thumb2/mve-vst4.ll
  llvm/test/CodeGen/X86/2008-04-09-BranchFolding.ll
  llvm/test/CodeGen/X86/2008-04-16-ReMatBug.ll
  llvm/test/CodeGen/X86/AMX/amx-across-func.ll
  llvm/test/CodeGen/X86/AMX/amx-spill-merge.ll
  llvm/test/CodeGen/X86/masked_load.ll
  llvm/test/CodeGen/X86/oddshuffles.ll
  llvm/test/CodeGen/X86/popcnt.ll
  llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll
  llvm/test/CodeGen/X86/sdiv_fix_sat.ll
  llvm/test/CodeGen/X86/vec_shift5.ll
  llvm/test/CodeGen/XCore/scavenging.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D123394.423903.patch
Type: text/x-patch
Size: 78979 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220420/96de44f3/attachment-0001.bin>