[all-commits] [llvm/llvm-project] 870fd5: Reapply "RegAllocFast: Record internal state based...

Matt Arsenault via All-commits all-commits at lists.llvm.org
Fri Sep 18 11:05:39 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 870fd53e4f6357946f4bad0b861c510cd107420c
      https://github.com/llvm/llvm-project/commit/870fd53e4f6357946f4bad0b861c510cd107420c
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2020-09-18 (Fri, 18 Sep 2020)

  Changed paths:
    M llvm/lib/CodeGen/RegAllocFast.cpp
    M llvm/test/CodeGen/AArch64/arm64-fast-isel-conversion-fallback.ll
    M llvm/test/CodeGen/AArch64/arm64-fast-isel-conversion.ll
    M llvm/test/CodeGen/AArch64/arm64-vcvt_f.ll
    M llvm/test/CodeGen/AArch64/fast-isel-sp-adjust.ll
    M llvm/test/CodeGen/AArch64/popcount.ll
    M llvm/test/CodeGen/AMDGPU/indirect-addressing-term.ll
    M llvm/test/CodeGen/AMDGPU/partial-sgpr-to-vgpr-spills.ll
    M llvm/test/CodeGen/AMDGPU/spill-m0.ll
    M llvm/test/CodeGen/AMDGPU/wwm-reserved.ll
    M llvm/test/CodeGen/ARM/legalize-bitcast.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/fptosi_and_fptoui.ll
    M llvm/test/CodeGen/Mips/atomic-min-max.ll
    M llvm/test/CodeGen/Mips/atomic.ll
    M llvm/test/CodeGen/Mips/implicit-sret.ll
    M llvm/test/CodeGen/PowerPC/addegluecrash.ll
    M llvm/test/CodeGen/PowerPC/popcount.ll
    M llvm/test/CodeGen/PowerPC/vsx.ll
    M llvm/test/CodeGen/SPARC/fp16-promote.ll
    M llvm/test/CodeGen/X86/2009-04-14-IllegalRegs.ll
    M llvm/test/CodeGen/X86/atomic-unordered.ll
    M llvm/test/CodeGen/X86/atomic32.ll
    M llvm/test/CodeGen/X86/atomic64.ll
    M llvm/test/CodeGen/X86/avx-load-store.ll
    M llvm/test/CodeGen/X86/avx512-mask-zext-bugfix.ll
    M llvm/test/CodeGen/X86/crash-O0.ll
    M llvm/test/CodeGen/X86/extend-set-cc-uses-dbg.ll
    M llvm/test/CodeGen/X86/fast-isel-nontemporal.ll
    M llvm/test/CodeGen/X86/lvi-hardening-loads.ll
    M llvm/test/CodeGen/X86/mixed-ptr-sizes.ll
    M llvm/test/CodeGen/X86/pr1489.ll
    M llvm/test/CodeGen/X86/pr27591.ll
    M llvm/test/CodeGen/X86/pr30430.ll
    M llvm/test/CodeGen/X86/pr30813.ll
    M llvm/test/CodeGen/X86/pr32241.ll
    M llvm/test/CodeGen/X86/pr32284.ll
    M llvm/test/CodeGen/X86/pr32340.ll
    M llvm/test/CodeGen/X86/pr32345.ll
    M llvm/test/CodeGen/X86/pr32451.ll
    M llvm/test/CodeGen/X86/pr34592.ll
    M llvm/test/CodeGen/X86/pr39733.ll
    M llvm/test/CodeGen/X86/pr44749.ll
    M llvm/test/CodeGen/X86/pr47000.ll
    M llvm/test/CodeGen/X86/regalloc-fast-missing-live-out-spill.mir
    M llvm/test/CodeGen/X86/swift-return.ll
    M llvm/test/CodeGen/X86/swifterror.ll
    M llvm/test/DebugInfo/X86/op_deref.ll

  Log Message:
  -----------
  Reapply "RegAllocFast: Record internal state based on register units"

The regressions this caused should be fixed when
https://reviews.llvm.org/D52010 is applied.

This reverts commit a21387c65470417c58021f8d3194a4510bb64f46.


  Commit: c8757ff3aa7dd7a25a6343f6ef74a70c7be04325
      https://github.com/llvm/llvm-project/commit/c8757ff3aa7dd7a25a6343f6ef74a70c7be04325
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2020-09-18 (Fri, 18 Sep 2020)

  Changed paths:
    M llvm/lib/CodeGen/RegAllocFast.cpp
    M llvm/test/CodeGen/AArch64/GlobalISel/darwin-tls-call-clobber.ll
    M llvm/test/CodeGen/AArch64/arm64-fast-isel-br.ll
    M llvm/test/CodeGen/AArch64/arm64-fast-isel-call.ll
    M llvm/test/CodeGen/AArch64/arm64-fast-isel-conversion-fallback.ll
    M llvm/test/CodeGen/AArch64/arm64-fast-isel-conversion.ll
    M llvm/test/CodeGen/AArch64/arm64-vcvt_f.ll
    M llvm/test/CodeGen/AArch64/arm64_32-fastisel.ll
    M llvm/test/CodeGen/AArch64/arm64_32-null.ll
    M llvm/test/CodeGen/AArch64/br-cond-not-merge.ll
    M llvm/test/CodeGen/AArch64/cmpxchg-O0.ll
    M llvm/test/CodeGen/AArch64/combine-loads.ll
    M llvm/test/CodeGen/AArch64/fast-isel-cmpxchg.ll
    M llvm/test/CodeGen/AArch64/popcount.ll
    M llvm/test/CodeGen/AArch64/swift-return.ll
    M llvm/test/CodeGen/AArch64/swifterror.ll
    M llvm/test/CodeGen/AArch64/unwind-preserved-from-mir.mir
    M llvm/test/CodeGen/AArch64/unwind-preserved.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inline-asm.ll
    M llvm/test/CodeGen/AMDGPU/control-flow-fastregalloc.ll
    A llvm/test/CodeGen/AMDGPU/fast-ra-kills-vcc.mir
    A llvm/test/CodeGen/AMDGPU/fastregalloc-illegal-subreg-physreg.mir
    M llvm/test/CodeGen/AMDGPU/fastregalloc-self-loop-heuristic.mir
    M llvm/test/CodeGen/AMDGPU/indirect-addressing-term.ll
    M llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll
    M llvm/test/CodeGen/AMDGPU/partial-sgpr-to-vgpr-spills.ll
    M llvm/test/CodeGen/AMDGPU/reserve-vgpr-for-sgpr-spill.ll
    M llvm/test/CodeGen/AMDGPU/spill-agpr.mir
    M llvm/test/CodeGen/AMDGPU/spill-m0.ll
    M llvm/test/CodeGen/AMDGPU/spill192.mir
    A llvm/test/CodeGen/AMDGPU/unexpected-reg-unit-state.mir
    M llvm/test/CodeGen/AMDGPU/wwm-reserved.ll
    M llvm/test/CodeGen/ARM/2010-08-04-StackVariable.ll
    M llvm/test/CodeGen/ARM/Windows/alloca.ll
    M llvm/test/CodeGen/ARM/cmpxchg-O0-be.ll
    M llvm/test/CodeGen/ARM/cmpxchg-O0.ll
    M llvm/test/CodeGen/ARM/crash-greedy-v6.ll
    M llvm/test/CodeGen/ARM/debug-info-blocks.ll
    M llvm/test/CodeGen/ARM/fast-isel-call.ll
    M llvm/test/CodeGen/ARM/fast-isel-intrinsic.ll
    M llvm/test/CodeGen/ARM/fast-isel-ldr-str-thumb-neg-index.ll
    M llvm/test/CodeGen/ARM/fast-isel-select.ll
    M llvm/test/CodeGen/ARM/fast-isel-vararg.ll
    M llvm/test/CodeGen/ARM/ldrd.ll
    M llvm/test/CodeGen/ARM/legalize-bitcast.ll
    M llvm/test/CodeGen/ARM/stack-guard-reassign.ll
    M llvm/test/CodeGen/ARM/swifterror.ll
    M llvm/test/CodeGen/ARM/thumb-big-stack.ll
    M llvm/test/CodeGen/Hexagon/vect/vect-load-v4i16.ll
    M llvm/test/CodeGen/Mips/Fast-ISel/callabi.ll
    M llvm/test/CodeGen/Mips/Fast-ISel/memtest1.ll
    M llvm/test/CodeGen/Mips/Fast-ISel/pr40325.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/add.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/add_vec.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/aggregate_struct_return.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/bitreverse.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/bitwise.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/branch.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/brindirect.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/bswap.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/call.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/ctlz.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/cttz.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/dyn_stackalloc.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/fcmp.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/float_constants.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/fptosi_and_fptoui.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/global_address.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/global_address_pic.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/icmp.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/jump_table_and_brjt.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/load_4_unaligned.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/load_split_because_of_memsize_or_align.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/long_ambiguous_chain_s32.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/long_ambiguous_chain_s64.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/mul.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/mul_vec.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/phi.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/rem_and_div.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/select.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/sitofp_and_uitofp.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/store_4_unaligned.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/store_split_because_of_memsize_or_align.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/sub.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/sub_vec.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/test_TypeInfoforMF.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/var_arg.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/zextLoad_and_sextLoad.ll
    M llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/zext_and_sext.ll
    M llvm/test/CodeGen/Mips/atomic-min-max.ll
    M llvm/test/CodeGen/Mips/atomic.ll
    M llvm/test/CodeGen/Mips/atomic64.ll
    M llvm/test/CodeGen/Mips/atomicCmpSwapPW.ll
    M llvm/test/CodeGen/Mips/copy-fp64.ll
    M llvm/test/CodeGen/Mips/implicit-sret.ll
    M llvm/test/CodeGen/Mips/micromips-eva.mir
    M llvm/test/CodeGen/Mips/msa/ldr_str.ll
    M llvm/test/CodeGen/PowerPC/addegluecrash.ll
    M llvm/test/CodeGen/PowerPC/aggressive-anti-dep-breaker-subreg.ll
    M llvm/test/CodeGen/PowerPC/aix-overflow-toc.py
    M llvm/test/CodeGen/PowerPC/anon_aggr.ll
    M llvm/test/CodeGen/PowerPC/builtins-ppc-p10vsx.ll
    M llvm/test/CodeGen/PowerPC/elf-common.ll
    M llvm/test/CodeGen/PowerPC/fast-isel-pcrel.ll
    M llvm/test/CodeGen/PowerPC/fp-int128-fp-combine.ll
    M llvm/test/CodeGen/PowerPC/fp-strict-fcmp-noopt.ll
    M llvm/test/CodeGen/PowerPC/fp64-to-int16.ll
    M llvm/test/CodeGen/PowerPC/p9-vinsert-vextract.ll
    M llvm/test/CodeGen/PowerPC/popcount.ll
    M llvm/test/CodeGen/PowerPC/spill-nor0.ll
    A llvm/test/CodeGen/PowerPC/spill-nor0.mir
    M llvm/test/CodeGen/PowerPC/stack-guard-reassign.ll
    M llvm/test/CodeGen/PowerPC/vsx-args.ll
    M llvm/test/CodeGen/PowerPC/vsx.ll
    M llvm/test/CodeGen/SPARC/fp16-promote.ll
    M llvm/test/CodeGen/SystemZ/swift-return.ll
    M llvm/test/CodeGen/SystemZ/swifterror.ll
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/branch-targets.ll
    M llvm/test/CodeGen/Thumb2/high-reg-spill.mir
    M llvm/test/CodeGen/Thumb2/mve-vector-spill.ll
    M llvm/test/CodeGen/X86/2009-04-14-IllegalRegs.ll
    M llvm/test/CodeGen/X86/2010-06-28-FastAllocTiedOperand.ll
    M llvm/test/CodeGen/X86/2013-10-14-FastISel-incorrect-vreg.ll
    M llvm/test/CodeGen/X86/atomic-monotonic.ll
    M llvm/test/CodeGen/X86/atomic-unordered.ll
    M llvm/test/CodeGen/X86/atomic32.ll
    M llvm/test/CodeGen/X86/atomic64.ll
    M llvm/test/CodeGen/X86/atomic6432.ll
    M llvm/test/CodeGen/X86/avx-load-store.ll
    M llvm/test/CodeGen/X86/avx512-mask-zext-bugfix.ll
    A llvm/test/CodeGen/X86/bug47278-eflags-error.mir
    A llvm/test/CodeGen/X86/bug47278.mir
    M llvm/test/CodeGen/X86/crash-O0.ll
    M llvm/test/CodeGen/X86/extend-set-cc-uses-dbg.ll
    M llvm/test/CodeGen/X86/fast-isel-cmp-branch.ll
    M llvm/test/CodeGen/X86/fast-isel-nontemporal.ll
    M llvm/test/CodeGen/X86/fast-isel-select-sse.ll
    M llvm/test/CodeGen/X86/fast-isel-select.ll
    M llvm/test/CodeGen/X86/fast-isel-x86-64.ll
    M llvm/test/CodeGen/X86/mixed-ptr-sizes-i686.ll
    M llvm/test/CodeGen/X86/mixed-ptr-sizes.ll
    M llvm/test/CodeGen/X86/phys-reg-local-regalloc.ll
    M llvm/test/CodeGen/X86/pr11415.ll
    M llvm/test/CodeGen/X86/pr1489.ll
    M llvm/test/CodeGen/X86/pr27591.ll
    M llvm/test/CodeGen/X86/pr30430.ll
    M llvm/test/CodeGen/X86/pr30813.ll
    M llvm/test/CodeGen/X86/pr32241.ll
    M llvm/test/CodeGen/X86/pr32284.ll
    M llvm/test/CodeGen/X86/pr32340.ll
    M llvm/test/CodeGen/X86/pr32345.ll
    M llvm/test/CodeGen/X86/pr32451.ll
    M llvm/test/CodeGen/X86/pr32484.ll
    M llvm/test/CodeGen/X86/pr34592.ll
    M llvm/test/CodeGen/X86/pr34653.ll
    M llvm/test/CodeGen/X86/pr39733.ll
    M llvm/test/CodeGen/X86/pr42452.ll
    M llvm/test/CodeGen/X86/pr44749.ll
    M llvm/test/CodeGen/X86/pr47000.ll
    M llvm/test/CodeGen/X86/regalloc-fast-missing-live-out-spill.mir
    M llvm/test/CodeGen/X86/stack-protector-msvc.ll
    M llvm/test/CodeGen/X86/stack-protector-strong-macho-win32-xor.ll
    M llvm/test/CodeGen/X86/swift-return.ll
    M llvm/test/CodeGen/X86/swifterror.ll
    M llvm/test/CodeGen/X86/volatile.ll
    M llvm/test/CodeGen/X86/win64_eh.ll
    M llvm/test/CodeGen/X86/x86-32-intrcc.ll
    M llvm/test/CodeGen/X86/x86-64-intrcc.ll
    M llvm/test/DebugInfo/AArch64/frameindices.ll
    M llvm/test/DebugInfo/AArch64/prologue_end.ll
    M llvm/test/DebugInfo/ARM/prologue_end.ll
    M llvm/test/DebugInfo/Mips/delay-slot.ll
    M llvm/test/DebugInfo/Mips/prologue_end.ll
    M llvm/test/DebugInfo/X86/dbg-declare-arg.ll
    M llvm/test/DebugInfo/X86/fission-ranges.ll
    M llvm/test/DebugInfo/X86/op_deref.ll
    M llvm/test/DebugInfo/X86/parameters.ll
    M llvm/test/DebugInfo/X86/pieces-1.ll
    M llvm/test/DebugInfo/X86/prologue-stack.ll
    M llvm/test/DebugInfo/X86/reference-argument.ll
    M llvm/test/DebugInfo/X86/spill-indirect-nrvo.ll
    M llvm/test/DebugInfo/X86/sret.ll
    M llvm/test/DebugInfo/X86/subreg.ll

  Log Message:
  -----------
  RegAllocFast: Rewrite and improve

This rewrites big parts of the fast register allocator. The basic
strategy of doing block-local allocation hasn't changed but I tweaked
several details:

- Track register state on register units instead of physical
  registers. This simplifies and speeds up handling of register
  aliases.
- Process basic blocks in reverse order: when walking backwards, a
  definition is known to end a register's live range (when walking
  forwards, by contrast, a use may or may not be a kill, so
  heuristics are needed).
- Check register mask operands (calls) instead of conservatively
  assuming everything is clobbered.
- Enhance heuristics to detect killing uses: if there is only a small
  number of defs/uses, check whether they are all in the same basic
  block; if so, the last one is a killing use.
- Enhance the heuristic for copy-coalescing through hinting: check the
  first k defs of a register for COPYs rather than relying on there
  being just a single definition.

When testing this on the full llvm test-suite including SPEC externals
I measured:

- An average 5.1% reduction in code size for X86 and a 4.9% reduction
  in code size on aarch64 (ranging between 0% and 20% depending on the
  test).
- 0.5% faster compile time (some analysis suggests the pass is
  slightly slower than before, but we more than make up for it because
  later passes are faster with the reduced instruction count).

Also adds a few testcases that were broken without this patch, in
particular bug 47278.
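
To make the register-unit and reverse-walk ideas above concrete, here is
a minimal, self-contained sketch. It is not the code from this patch and
does not use any LLVM API: PhysReg, UnitState, Instr and
allocateBlockBackwards are hypothetical names invented for illustration,
and the model assumes each virtual register has a single definition and
that nothing ever needs to be spilled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these names come from RegAllocFast.cpp.
using RegUnit = unsigned;

struct PhysReg {
  const char *Name;
  std::vector<RegUnit> Units; // aliasing registers share at least one unit
};

// Tracks occupancy per register unit, so alias checks are unit lookups
// instead of walks over every overlapping physical register.
class UnitState {
  std::vector<bool> Busy;

public:
  explicit UnitState(unsigned NumUnits) : Busy(NumUnits, false) {}

  bool isFree(const PhysReg &R) const {
    for (RegUnit U : R.Units)
      if (Busy[U])
        return false;
    return true;
  }

  void set(const PhysReg &R, bool InUse) {
    for (RegUnit U : R.Units)
      Busy[U] = InUse;
  }
};

struct Instr {
  int Def;               // virtual register defined, or -1 for none
  std::vector<int> Uses; // virtual registers read
};

// Walks the block bottom-up. The first use seen for a virtual register is
// its last use in program order, so it starts the live range; the
// definition ends it and frees the units. Assign must be pre-sized to the
// number of virtual registers. Returns false when no register is free
// (a real allocator would spill instead).
bool allocateBlockBackwards(const std::vector<Instr> &Block,
                            const std::vector<PhysReg> &Regs,
                            unsigned NumUnits, std::vector<int> &Assign) {
  UnitState State(NumUnits);
  std::vector<bool> Live(Assign.size(), false);

  auto assign = [&](int V) {
    for (std::size_t I = 0; I < Regs.size(); ++I) {
      if (State.isFree(Regs[I])) {
        State.set(Regs[I], true);
        Assign[V] = static_cast<int>(I);
        Live[V] = true;
        return true;
      }
    }
    return false;
  };

  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    if (It->Def >= 0) {
      // A def with no later use still needs a register for the write.
      if (!Live[It->Def] && !assign(It->Def))
        return false;
      State.set(Regs[Assign[It->Def]], false); // the def ends the live range
      Live[It->Def] = false;
    }
    for (int V : It->Uses)
      if (!Live[V] && !assign(V))
        return false;
  }
  return true;
}
```

In this model no kill heuristic is needed: whether a use is the last one
follows directly from the walk direction. Because the definition is
handled before the uses of the same instruction, a result can also reuse
the register of an operand whose live range ends at that instruction.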

Patch mostly by Matthias Braun


  Commit: 3105d0f84bfa6b765bb88cbf090f557e588764ea
      https://github.com/llvm/llvm-project/commit/3105d0f84bfa6b765bb88cbf090f557e588764ea
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2020-09-18 (Fri, 18 Sep 2020)

  Changed paths:
    M llvm/include/llvm/CodeGen/MachineBasicBlock.h
    M llvm/lib/CodeGen/MachineBasicBlock.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp

  Log Message:
  -----------
  CodeGen: Move split block utility to MachineBasicBlock

AMDGPU needs this in several places, so consolidate them here.


Compare: https://github.com/llvm/llvm-project/compare/91aed9bf975f...3105d0f84bfa

