[llvm] [Hexagon] Fix O(N^2) compile-time regression in HexagonOptAddrMode (PR #189531)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 31 01:37:17 PDT 2026
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-hexagon
Author: Abinaya Saravanan (quic-asaravan)
<details>
<summary>Changes</summary>
In HexagonOptAddrMode::processAddUses, isSafeToExtLR was called inside
the loop over UNodeList with loop-invariant arguments. isSafeToExtLR
iterates over UNodeList, so the total work was O(N^2) in the number of
uses.
The arguments (AddSN, AddMI, BaseReg, UNodeList) do not change across
iterations, so the function returns the same value regardless of which
iteration calls it. Move the call after the loop so that it runs once;
the complexity drops to O(N).
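The hoisting described above can be illustrated with a minimal, self-contained sketch. This is not the Hexagon code: `isSafeForAll`, `isValidUse`, `processBefore`, `processAfter`, and `ScanCount` are hypothetical stand-ins for isSafeToExtLR, isValidOffset, and the two versions of processAddUses, and the counter exists only to make the amount of work visible.

```cpp
#include <cstddef>
#include <vector>

// Illustration only: counts how many list elements the invariant check visits.
static std::size_t ScanCount = 0;

// Hypothetical stand-in for isSafeToExtLR: an O(N) scan whose result depends
// only on the list, never on which loop iteration calls it.
static bool isSafeForAll(const std::vector<int> &Uses) {
  for (int U : Uses) {
    ++ScanCount;
    if (U < 0)
      return false;
  }
  return true;
}

// Hypothetical stand-in for the per-use check (isValidOffset in the patch).
static bool isValidUse(int U) { return U < 1024; }

// Before: the loop-invariant O(N) check runs once per use, so O(N^2) total.
static bool processBefore(const std::vector<int> &Uses) {
  for (int U : Uses) {
    if (!isValidUse(U))
      return false;
    if (!isSafeForAll(Uses))
      return false;
  }
  return true;
}

// After: validate every use first, then run the invariant check once, O(N).
static bool processAfter(const std::vector<int> &Uses) {
  for (int U : Uses)
    if (!isValidUse(U))
      return false;
  return isSafeForAll(Uses);
}
```

For a non-empty list both versions return the same result (true only if every use is valid and the invariant check passes); only the number of elements scanned differs.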
Background
----------
Commit 8c0483bba2d2 ("RegisterCoalescer: Fix assert on remat to
copy-to-physreg with subregs") introduced register coalescer
rematerialization changes that produce additional uses of A2_addi
instructions on the Hexagon backend, inflating UNodeList. This exposed
the pre-existing O(N^2) behavior in processAddUses.
Measurements
------------
Input: rkvdec-vdpu383-h264.i (Hexagon kernel driver, -O2)
Tool: hexagon-linux-musl-clang (clang-20)
Scenario                         | HexagonOptAddrMode | Total
---------------------------------|--------------------|--------
Before blamed commit (baseline)  | 1.35 s             | ~195 s
After blamed commit, without fix | 221,845 s          | >61 h
After blamed commit, with fix    | 52.16 s            | ~225 s
Fixes: https://github.com/llvm/llvm-project/issues/178535
---
Full diff: https://github.com/llvm/llvm-project/pull/189531.diff
2 Files Affected:
- (modified) llvm/lib/Target/Hexagon/HexagonOptAddrMode.cpp (+12-10)
- (added) llvm/test/CodeGen/Hexagon/opt-addr-mode-large-unodelist.ll (+33)
``````````diff
diff --git a/llvm/lib/Target/Hexagon/HexagonOptAddrMode.cpp b/llvm/lib/Target/Hexagon/HexagonOptAddrMode.cpp
index f266678071b5b..7a4b51ad11082 100644
--- a/llvm/lib/Target/Hexagon/HexagonOptAddrMode.cpp
+++ b/llvm/lib/Target/Hexagon/HexagonOptAddrMode.cpp
@@ -723,18 +723,20 @@ bool HexagonOptAddrMode::processAddUses(NodeAddr<StmtNode *> AddSN,
int64_t newOffset = OffsetOp.getImm() + AddMI->getOperand(2).getImm();
if (!isValidOffset(MI, newOffset))
return false;
-
- // Since we'll be extending the live range of Rt in the following example,
- // make sure that is safe. another definition of Rt doesn't exist between 'add'
- // and load/store instruction.
- //
- // Ex: Rx= add(Rt,#10)
- // memw(Rx+#0) = Rs
- // will be replaced with => memw(Rt+#10) = Rs
- if (!isSafeToExtLR(AddSN, AddMI, BaseReg, UNodeList))
- return false;
}
+ // Since we'll be extending the live range of Rt in the following example,
+ // make sure that is safe. another definition of Rt doesn't exist between
+ // 'add' and load/store instruction.
+ //
+ // Ex: Rx= add(Rt,#10)
+ // memw(Rx+#0) = Rs
+ // will be replaced with => memw(Rt+#10) = Rs
+ // Note: isSafeToExtLR arguments are loop-invariant; call it once after
+ // validating all uses to avoid O(N^2) behavior when UNodeList is large.
+ if (!isSafeToExtLR(AddSN, AddMI, BaseReg, UNodeList))
+ return false;
+
NodeId LRExtRegRD = 0;
// Iterate through all the UseNodes in SN and find the reaching def
// for the LRExtReg.
diff --git a/llvm/test/CodeGen/Hexagon/opt-addr-mode-large-unodelist.ll b/llvm/test/CodeGen/Hexagon/opt-addr-mode-large-unodelist.ll
new file mode 100644
index 0000000000000..a5e5ce2b8a7be
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/opt-addr-mode-large-unodelist.ll
@@ -0,0 +1,33 @@
+; RUN: llc -mtriple=hexagon -O2 < %s | FileCheck %s
+
+; Verify that processAddUses correctly folds an A2_addi into multiple
+; store offsets. With many uses, isSafeToExtLR must be called once
+; (not once per use) to avoid O(N^2) compile time.
+
+; The A2_addi computing the base address should be eliminated; all
+; stores should use the original base register with the combined offset.
+
+; CHECK-LABEL: f0:
+; CHECK-NOT: = add(r
+
+define void @f0(ptr %base, i8 %a, i8 %b, i8 %c, i8 %d,
+ i8 %e, i8 %f, i8 %g, i8 %h) nounwind {
+entry:
+ %p0 = getelementptr inbounds i8, ptr %base, i32 13
+ store i8 %a, ptr %p0, align 1
+ %p1 = getelementptr inbounds i8, ptr %p0, i32 1
+ store i8 %b, ptr %p1, align 1
+ %p2 = getelementptr inbounds i8, ptr %p0, i32 2
+ store i8 %c, ptr %p2, align 1
+ %p3 = getelementptr inbounds i8, ptr %p0, i32 3
+ store i8 %d, ptr %p3, align 1
+ %p4 = getelementptr inbounds i8, ptr %p0, i32 4
+ store i8 %e, ptr %p4, align 1
+ %p5 = getelementptr inbounds i8, ptr %p0, i32 5
+ store i8 %f, ptr %p5, align 1
+ %p6 = getelementptr inbounds i8, ptr %p0, i32 6
+ store i8 %g, ptr %p6, align 1
+ %p7 = getelementptr inbounds i8, ptr %p0, i32 7
+ store i8 %h, ptr %p7, align 1
+ ret void
+}
``````````
</details>
https://github.com/llvm/llvm-project/pull/189531