[llvm] [Hexagon] Add :mem_noshuf for store-load pairs with no scheduler Order dep (PR #181456)

via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 13 20:28:53 PST 2026


llvmbot wrote:



@llvm/pr-subscribers-backend-hexagon

Author: Brian Cain (androm3da)

<details>
<summary>Changes</summary>

When TBAA tells the scheduler that a store and load access different types, the scheduler omits the Order (memory) dependency edge between them.  On V65+ the packetizer can then place the store in slot 1 and the load in slot 0 of the same packet.  Without :mem_noshuf the hardware is free to reorder the memory operations, which is unsound when the pointers actually alias at runtime (TBAA can be overly optimistic with type-punning patterns such as libc++ tree node casts).

Re-check aliasing with UseTBAA=false in the packetizer whenever a store-load pair has no scheduler Order dependency, and mark the packet :mem_noshuf if the accesses may alias.  Skip the re-check when either instruction has a memory operand backed by a PseudoSourceValue (stack slot, constant pool, GOT, jump table), since TBAA is not the source of the NoAlias conclusion for those accesses.
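For readers who want the decision in isolation, the following is a minimal standalone sketch of the check described above. It is not the patch itself: `MemInst`, `needsMemNoshuf`, and every field name are hypothetical stand-ins for the properties the packetizer queries, and `MayAliasWithoutTBAA` stands in for the result of an alias query made with TBAA disabled.

```cpp
// Hypothetical model of the properties queried on each instruction.
// None of these names are LLVM API.
struct MemInst {
  bool MayLoad = false;
  bool MayStore = false;
  bool IsNewValueStore = false;
  bool IsFrameOp = false;            // models S2_allocframe / L2_deallocframe
  bool IsMemOp = false;              // models a read-modify-write memop
  bool IsHVXVec = false;             // models an HVX vector instruction
  bool HasPseudoSourceValue = false; // stack slot, constant pool, GOT, jump table
};

// Returns true when the packet holding I and J should be marked :mem_noshuf.
// MayAliasWithoutTBAA is assumed to be the answer of an alias query that
// ignores TBAA metadata.
constexpr bool needsMemNoshuf(const MemInst &I, const MemInst &J,
                              bool Slot1Store, bool HasV65Ops,
                              bool MayAliasWithoutTBAA) {
  // Only relevant when a store sits in slot 1 on V65 or later.
  if (!Slot1Store || !HasV65Ops)
    return false;

  // One instruction must store and the other must load; new-value stores
  // are excluded.
  const bool StoreLoadPair =
      (J.MayLoad && I.MayStore && !I.IsNewValueStore) ||
      (J.MayStore && I.MayLoad && !J.IsNewValueStore);
  if (!StoreLoadPair)
    return false;

  // Frame setup/teardown, memops, and HVX instructions are left alone.
  if (I.IsFrameOp || J.IsFrameOp || I.IsMemOp || J.IsMemOp ||
      I.IsHVXVec || J.IsHVXVec)
    return false;

  // For PseudoSourceValue accesses, NoAlias did not come from TBAA.
  if (I.HasPseudoSourceValue || J.HasPseudoSourceValue)
    return false;

  return MayAliasWithoutTBAA;
}
```

In this model a plain store paired with a plain load yields `true` only when the TBAA-free query says the accesses may alias; every early exit leaves the packet unmarked.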

---
Full diff: https://github.com/llvm/llvm-project/pull/181456.diff


2 Files Affected:

- (modified) llvm/lib/Target/Hexagon/HexagonVLIWPacketizer.cpp (+50-1) 
- (added) llvm/test/CodeGen/Hexagon/packetize-mem-noshuf-tbaa.ll (+32) 


``````````diff
diff --git a/llvm/lib/Target/Hexagon/HexagonVLIWPacketizer.cpp b/llvm/lib/Target/Hexagon/HexagonVLIWPacketizer.cpp
index d39b79a86753a..e4dd49a05fbc8 100644
--- a/llvm/lib/Target/Hexagon/HexagonVLIWPacketizer.cpp
+++ b/llvm/lib/Target/Hexagon/HexagonVLIWPacketizer.cpp
@@ -1394,9 +1394,53 @@ bool HexagonPacketizerList::isLegalToPacketizeTogether(SUnit *SUI, SUnit *SUJ) {
       return false;
   }
 
+  // When the scheduler found no Order (memory) dependency between a
+  // store-load pair — either because there is no DAG edge at all, or
+  // because the only edges are Anti/register deps — the pair can land in
+  // the same V65+ packet.  Re-check aliasing without TBAA (TBAA may have
+  // been the reason the scheduler omitted the Order edge) and, if the
+  // accesses may alias, mark the packet :mem_noshuf so the hardware does
+  // not reorder the memory operations.
+  auto CheckMemNoshufForSlot1Store = [&]() {
+    if (!Slot1Store || !MF.getSubtarget<HexagonSubtarget>().hasV65Ops())
+      return;
+    bool LoadJ = J.mayLoad(), StoreJ = J.mayStore();
+    bool LoadI = I.mayLoad(), StoreI = I.mayStore();
+    bool NVStoreJ = HII->isNewValueStore(J);
+    bool NVStoreI = HII->isNewValueStore(I);
+    bool IsVecJ = HII->isHVXVec(J);
+    bool IsVecI = HII->isHVXVec(I);
+
+    if (((LoadJ && StoreI && !NVStoreI) || (StoreJ && LoadI && !NVStoreJ)) &&
+        (J.getOpcode() != Hexagon::S2_allocframe &&
+         I.getOpcode() != Hexagon::S2_allocframe) &&
+        (J.getOpcode() != Hexagon::L2_deallocframe &&
+         I.getOpcode() != Hexagon::L2_deallocframe) &&
+        (!HII->isMemOp(J) && !HII->isMemOp(I)) && (!IsVecJ && !IsVecI)) {
+      // If either instruction accesses a stack slot, constant pool, GOT,
+      // or jump table (PseudoSourceValue), the scheduler's NoAlias result
+      // does not come from TBAA and is reliable, so skip the re-check.
+      // Spurious TBAA NoAlias results only affect heap-to-heap accesses
+      // through different pointer types (e.g. libc++ tree node casts).
+      auto HasPSV = [](const MachineInstr &MI) {
+        for (const MachineMemOperand *MMO : MI.memoperands())
+          if (MMO->getPseudoValue())
+            return true;
+        return false;
+      };
+      if (HasPSV(J) || HasPSV(I))
+        return;
+
+      if (J.mayAlias(AA, I, /*UseTBAA=*/false))
+        setmemShufDisabled(true);
+    }
+  };
+
   // There no dependency between a prolog instruction and its successor.
-  if (!SUJ->isSucc(SUI))
+  if (!SUJ->isSucc(SUI)) {
+    CheckMemNoshufForSlot1Store();
     return true;
+  }
 
   for (unsigned i = 0; i < SUJ->Succs.size(); ++i) {
     if (FoundSequentialDependence)
@@ -1628,6 +1672,11 @@ bool HexagonPacketizerList::isLegalToPacketizeTogether(SUnit *SUI, SUnit *SUJ) {
     return false;
   }
 
+  // The dependency loop found no blocking dependence — only Anti deps
+  // (or Order deps that the V65 Slot1Store path already handled).
+  // Still need to guard against a store-load pair whose Order dep was
+  // omitted by the scheduler due to TBAA.
+  CheckMemNoshufForSlot1Store();
   return true;
 }
 
diff --git a/llvm/test/CodeGen/Hexagon/packetize-mem-noshuf-tbaa.ll b/llvm/test/CodeGen/Hexagon/packetize-mem-noshuf-tbaa.ll
new file mode 100644
index 0000000000000..403f699ca028b
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/packetize-mem-noshuf-tbaa.ll
@@ -0,0 +1,32 @@
+; RUN: llc -march=hexagon -mcpu=hexagonv65 -O2 < %s | FileCheck %s
+
+; The scheduler uses TBAA to determine that the store and load below
+; access different types, so it does not create an Order dependency
+; between them.  On V65+ the packetizer can place a store in slot 1
+; and a load in slot 0 of the same packet.  Without :mem_noshuf the
+; hardware is free to reorder the memory operations, which is unsound
+; when the pointers actually alias at runtime.
+;
+; Verify that the packetizer adds :mem_noshuf to protect the packet
+; even though the scheduler found no memory dependency.
+
+; CHECK-LABEL: test_noshuf_no_sched_dep:
+; CHECK:      memw(r0+#0) = #1
+; CHECK:      r0 = memw(r1+#0)
+; CHECK:      } :mem_noshuf
+
+define i32 @test_noshuf_no_sched_dep(ptr %p, ptr %q) #0 {
+entry:
+  store i32 1, ptr %p, align 4, !tbaa !0
+  %v = load i32, ptr %q, align 4, !tbaa !3
+  ret i32 %v
+}
+
+attributes #0 = { nounwind }
+
+; Two unrelated TBAA type descriptors under the same root.
+!0 = !{!1, !1, i64 0}        ; store accesses "type_a"
+!1 = !{!"type_a", !2}
+!2 = !{!"tbaa_root"}
+!3 = !{!4, !4, i64 0}        ; load accesses "type_b"
+!4 = !{!"type_b", !2}

``````````

</details>


https://github.com/llvm/llvm-project/pull/181456

