[llvm] [MachineLICM] Workaround - apply RegMasks conservatively (PR #95926)
Pierre van Houtryve via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 19 00:31:11 PDT 2024
================
@@ -0,0 +1,49 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=aarch64-unknown-linux-gnu -run-pass=greedy,machinelicm -verify-machineinstrs -debug -o - %s | FileCheck %s
+
+# FIXME: Running RA is needed otherwise it runs pre-RA LICM.
+---
+name: test
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: test
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: liveins: $x0, $w1, $x2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: B %bb.1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ ; CHECK-NEXT: liveins: $x0, $w1, $x2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: renamable $q11 = MOVIv4i32 2, 8
+ ; CHECK-NEXT: BL &memset, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit $w1, implicit $x2, implicit-def $sp, implicit-def $x0
+ ; CHECK-NEXT: renamable $q10 = MVNIv4i32 4, 0
----------------
Pierre-vh wrote:
We could perhaps do a targeted fix for AArch64 too, and create an artificial "high" register for Q registers to model this.
@davemgreen What do you think about that?
To recap, we have 3 options:
1. Revert the MachineLICM change
   - The change is too beneficial for AMDGPU to drop outright, so I would instead implement the 2 approaches side by side and use TRI to switch between them. All targets except AArch64 would use the new approach.
2. Fix regunits calculations
   - The perfect fix in theory. In practice, I feel this could have a big impact on regalloc as a whole, so I doubt it would be a small change that lands quickly.
3. Add fake high registers for AArch64 Q registers
   - I think we do that for some registers on AMDGPU already. On one hand, it can be seen as the right thing to do to model the high bits properly; OTOH it's a hack around shortcomings of the register modeling we have now in LLVM.
Option 1 vs Option 3 is basically a decision on whether we want the workaround to be in the backend or in the pass. I think option 2 needs to be done anyway, but I expect it'll be a longer task.
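
To make the regunit problem concrete, here is a toy sketch of the two ways a call's regmask could be folded into a per-regunit clobber set. It is not LLVM's code: the `Reg` struct, the function names, and the assumption that `d11` and `q11` share a single unit (because nothing models the high 64 bits of `q11` separately) are all hypothetical and only illustrate the discussion above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types and functions are hypothetical, not LLVM APIs.
struct Reg {
  std::string Name;
  std::vector<int> Units; // register units this register covers
  bool Preserved;         // true if the call's regmask preserves it
};

// Assumed layout: d11 and q11 share unit 0, because no separate unit
// models the high 64 bits of q11.
std::vector<Reg> makeRegs() {
  return {
      {"d11", {0}, true},  // low 64 bits, preserved by the call
      {"q11", {0}, false}, // full 128 bits, not preserved
  };
}

// Returns true if register R covers the given unit.
static bool covers(const Reg &R, int Unit) {
  for (int U : R.Units)
    if (U == Unit)
      return true;
  return false;
}

// "Preserved wins": a unit counts as clobbered only if no register
// covering it is preserved by the mask.
bool clobberedPreservedWins(const std::vector<Reg> &Regs, int Unit) {
  for (const Reg &R : Regs)
    if (covers(R, Unit) && R.Preserved)
      return false;
  return true;
}

// "Clobbered wins" (conservative): a unit counts as clobbered as soon as
// any register covering it is not preserved by the mask.
bool clobberedConservative(const std::vector<Reg> &Regs, int Unit) {
  for (const Reg &R : Regs)
    if (covers(R, Unit) && !R.Preserved)
      return true;
  return false;
}
```

With this table, `clobberedPreservedWins` reports unit 0 as not clobbered, which would let a `q11` value be treated as surviving the call, while `clobberedConservative` reports it as clobbered. The second behaviour is what the test above relies on to keep the `$q11` def from being moved relative to the `BL &memset`.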
https://github.com/llvm/llvm-project/pull/95926