[PATCH] D110339: SelectionDAGBuilder: Improve canonicalization by not swapping branch targets

Matthias Braun via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 23 08:53:40 PDT 2021


MatzeB created this revision.
MatzeB added reviewers: bogner, craig.topper, rnk, RKSimon, Jiangning, mcrosier, lebedev.ri, efriedma.
Herald added subscribers: foad, frasercrmck, wenlei, kerbowa, luismarques, apazos, sameer.abuasal, steven.zhang, pengfei, s.egerton, Jim, jocewei, PkmX, arphaman, the_o, brucehoult, MartinMosbeck, rogfer01, atanasyan, edward-jones, zzheng, jrtc27, niosHD, sabuasal, sunfish, simoncook, johnrusso, rbar, asb, fedor.sergeev, kbarton, hiraditya, kristof.beyls, jgravelle-google, arichardson, sbc100, nhaehnle, jvesely, nemanjai, sdardis, dylanmckay, jyknight, dschuff, arsenm, qcolombet, jholewinski.
MatzeB requested review of this revision.
Herald added subscribers: llvm-commits, MaskRay, aheejin.
Herald added a project: LLVM.

Introduces a TargetLowerig::optimizeFallthroughsEarly() switch to
stop SelectionDAG flipping jump targets to create fallthroughs at
early if desired.

Motivation
----------

In most targets comparison instructions produce results for multiple
relations, so a program like `if (x == C) { ... } else if (x > C) { ... }`
can use a single compare instructions for 2 conditional branches. This pattern
is widespread in one of our systems (a decref operation with support for
immortal objects).

Machine-CSE would not consistently trigger, because:
SelectionDAG would often optimizes for fallthroughs,
flip true/false and negate the condition:

  x > C
  --> !(x > C) with true/false branches flipped
  --> X <= C
  --> X < (C + 1)  canonicalization

So we end up with a different constant inhibiting CSE!

However with MachineBlockPlacement optimzing control flow
and fallthroughs anyway there is no real benefit from
SelectionDAG optimizing early when it negatively affects
canonicalization.

Target Specifics
----------------

I did not apply the change to most targets:

- ARM: IfConverter pass currently relies on fallthroughs existing
- AArch64: Has a custom AArch64ConditionOptimizer pass to CSE comparison. Strangely the unit tests do not show consistent behavior with and without these changes. Glancing at the code I wonder if the algorithm only checks control flow among the "true" direction of a conditional branch and misses opportunities in the "false" direction.
- PowerPC+Mips: It seems in some tests `not` instructions are inserted when optimizing for fallthrough instead of flipping true/false.
- BPF+WebAssembly: The TargetInstrInfo::analyzeBranch implementation appears to be incomplete so MachineBlockPlacement cannot optimize for fallthroughs.
- Other targets: Test regressions that I did not understand.

The changes are enabled for: X86, RISCV, Lanai, XCore and by default
for new targets.

Test Changes
------------

There is a lot of test churn. Most is benign because of different
canonicalization (which in fact makes it more likely to see the same
constant in LLVM-IR and assembly now).

There are a number of tests containing unoptimized conditional branches
jumping on true/false, getting better or worse. But those are just
artifacts of bugpoint reduction or manual construction and
won't happen in practice after InstCombine:

est/CodeGen/X86/callbr-asm-blockplacement.ll
test/CodeGen/X86/pr29170.ll
test/CodeGen/X86/pr46585.ll
test/CodeGen/X86/setcc-freeze.ll
test/CodeGen/X86/2008-04-17-CoalescerBug.ll
test/CodeGen/X86/2007-01-13-StackPtrIndex.ll
test/CodeGen/X86/2007-03-01-SpillerCrash.ll
test/CodeGen/X86/2007-12-18-LoadCSEBug.ll
test/CodeGen/X86/legalize-shift-64.ll
test/CodeGen/X86/avx-cmp.ll
test/CodeGen/X86/setcc-freeze.ll
test/CodeGen/X86/shrink-compare-pgso.ll

These tests show improved codegen because duplicated CMPs get
eliminated:

test/CodeGen/X86/speculative-load-hardening-indirect.ll
test/CodeGen/X86/switch-density.ll
test/CodeGen/X86/cse-cmp.ll  (new test)

These appear to improve because of different normalization:

test/CodeGen/X86/cmp-bool.ll
test/CodeGen/X86/fp128-select.ll
test/CodeGen/X86/xmulo.ll


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D110339

Files:
  llvm/include/llvm/CodeGen/TargetLowering.h
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
  llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
  llvm/lib/Target/ARM/ARMISelLowering.h
  llvm/lib/Target/AVR/AVRISelLowering.h
  llvm/lib/Target/BPF/BPFISelLowering.cpp
  llvm/lib/Target/BPF/BPFISelLowering.h
  llvm/lib/Target/Hexagon/HexagonISelLowering.h
  llvm/lib/Target/Mips/MipsISelLowering.cpp
  llvm/lib/Target/Mips/MipsISelLowering.h
  llvm/lib/Target/NVPTX/NVPTXISelLowering.h
  llvm/lib/Target/PowerPC/PPCISelLowering.h
  llvm/lib/Target/Sparc/SparcISelLowering.cpp
  llvm/lib/Target/Sparc/SparcISelLowering.h
  llvm/lib/Target/SystemZ/SystemZISelLowering.h
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h
  llvm/test/CodeGen/Lanai/sub-cmp-peephole.ll
  llvm/test/CodeGen/RISCV/codemodel-lowering.ll
  llvm/test/CodeGen/RISCV/double-br-fcmp.ll
  llvm/test/CodeGen/RISCV/double-previous-failure.ll
  llvm/test/CodeGen/RISCV/float-br-fcmp.ll
  llvm/test/CodeGen/RISCV/half-br-fcmp.ll
  llvm/test/CodeGen/RISCV/jumptable.ll
  llvm/test/CodeGen/RISCV/shrinkwrap.ll
  llvm/test/CodeGen/X86/2006-08-21-ExtraMovInst.ll
  llvm/test/CodeGen/X86/2006-11-17-IllegalMove.ll
  llvm/test/CodeGen/X86/2007-01-13-StackPtrIndex.ll
  llvm/test/CodeGen/X86/2007-03-01-SpillerCrash.ll
  llvm/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll
  llvm/test/CodeGen/X86/2008-04-16-ReMatBug.ll
  llvm/test/CodeGen/X86/2008-04-17-CoalescerBug.ll
  llvm/test/CodeGen/X86/2008-05-21-CoalescerBug.ll
  llvm/test/CodeGen/X86/2012-08-17-legalizer-crash.ll
  llvm/test/CodeGen/X86/AMX/amx-ldtilecfg-insert.ll
  llvm/test/CodeGen/X86/AMX/amx-lower-tile-copy.ll
  llvm/test/CodeGen/X86/AMX/amx-spill-merge.ll
  llvm/test/CodeGen/X86/atomic-unordered.ll
  llvm/test/CodeGen/X86/atomic64.ll
  llvm/test/CodeGen/X86/atomic6432.ll
  llvm/test/CodeGen/X86/avoid-sfb.ll
  llvm/test/CodeGen/X86/avx-cmp.ll
  llvm/test/CodeGen/X86/avx-load-store.ll
  llvm/test/CodeGen/X86/avx512-broadcast-unfold.ll
  llvm/test/CodeGen/X86/callbr-asm-blockplacement.ll
  llvm/test/CodeGen/X86/cmp-bool.ll
  llvm/test/CodeGen/X86/cmpxchg-clobber-flags.ll
  llvm/test/CodeGen/X86/copy-eflags.ll
  llvm/test/CodeGen/X86/cse-cmp.ll
  llvm/test/CodeGen/X86/fp128-cast.ll
  llvm/test/CodeGen/X86/fp128-select.ll
  llvm/test/CodeGen/X86/jump_sign.ll
  llvm/test/CodeGen/X86/lea.ll
  llvm/test/CodeGen/X86/legalize-shift-64.ll
  llvm/test/CodeGen/X86/loop-blocks.ll
  llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
  llvm/test/CodeGen/X86/machine-cse.ll
  llvm/test/CodeGen/X86/masked_compressstore.ll
  llvm/test/CodeGen/X86/masked_expandload.ll
  llvm/test/CodeGen/X86/masked_load.ll
  llvm/test/CodeGen/X86/masked_store.ll
  llvm/test/CodeGen/X86/masked_store_trunc.ll
  llvm/test/CodeGen/X86/masked_store_trunc_ssat.ll
  llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
  llvm/test/CodeGen/X86/musttail-varargs.ll
  llvm/test/CodeGen/X86/nobt.ll
  llvm/test/CodeGen/X86/optimize-max-0.ll
  llvm/test/CodeGen/X86/or-branch.ll
  llvm/test/CodeGen/X86/peephole-na-phys-copy-folding.ll
  llvm/test/CodeGen/X86/pr29170.ll
  llvm/test/CodeGen/X86/pr36602.ll
  llvm/test/CodeGen/X86/pr38217.ll
  llvm/test/CodeGen/X86/pr38795.ll
  llvm/test/CodeGen/X86/pr46585.ll
  llvm/test/CodeGen/X86/pr46827.ll
  llvm/test/CodeGen/X86/pr50254.ll
  llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll
  llvm/test/CodeGen/X86/reverse_branches.ll
  llvm/test/CodeGen/X86/select.ll
  llvm/test/CodeGen/X86/setcc-freeze.ll
  llvm/test/CodeGen/X86/shrink-compare-pgso.ll
  llvm/test/CodeGen/X86/shrink-compare.ll
  llvm/test/CodeGen/X86/sibcall.ll
  llvm/test/CodeGen/X86/slow-incdec.ll
  llvm/test/CodeGen/X86/speculative-execution-side-effect-suppression.ll
  llvm/test/CodeGen/X86/speculative-load-hardening-indirect.ll
  llvm/test/CodeGen/X86/statepoint-ra.ll
  llvm/test/CodeGen/X86/statepoint-vreg-details.ll
  llvm/test/CodeGen/X86/statepoint-vreg-invoke.ll
  llvm/test/CodeGen/X86/switch-bt.ll
  llvm/test/CodeGen/X86/switch-density.ll
  llvm/test/CodeGen/X86/switch-lower-peel-top-case.ll
  llvm/test/CodeGen/X86/switch.ll
  llvm/test/CodeGen/X86/tail-dup-asm-goto.ll
  llvm/test/CodeGen/X86/tail-dup-merge-loop-headers.ll
  llvm/test/CodeGen/X86/tail-opts.ll
  llvm/test/CodeGen/X86/vector-shift-by-select-loop.ll
  llvm/test/CodeGen/X86/widen_cast-1.ll
  llvm/test/CodeGen/X86/widen_cast-2.ll
  llvm/test/CodeGen/X86/x32-va_start.ll
  llvm/test/CodeGen/X86/x86-repmov-copy-eflags.ll
  llvm/test/CodeGen/X86/x86-shrink-wrapping.ll
  llvm/test/CodeGen/X86/xmulo.ll
  llvm/test/CodeGen/X86/xor-combine-debugloc.ll
  llvm/test/CodeGen/X86/xor-icmp.ll
  llvm/test/CodeGen/XCore/threads.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D110339.374571.patch
Type: text/x-patch
Size: 153833 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210923/cca9dbe2/attachment-0001.bin>


More information about the llvm-commits mailing list