[PATCH] D71717: [MachineScheduler] Ignore artificial edges when forming store chains

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 19 10:03:16 PST 2019


foad created this revision.
Herald added subscribers: javed.absar, hiraditya, nhaehnle, jvesely, MatzeB.
Herald added a project: LLVM.
foad added reviewers: atrick, arsenm, rampitec, nhaehnle.
foad added a comment.
Herald added a subscriber: wdng.

Incidentally I did wonder if `SDep::isCtrl` itself should be taught that an artificial dependency should not be considered a control dependency.


BaseMemOpClusterMutation::apply forms store chains by looking for
control (i.e. non-data) dependencies from one mem op to another.

In the test case, clusterNeighboringMemOps successfully clusters the
stores, and then adds artificial edges to the stores' successors as
described in the comment:

  // Copy successor edges from SUa to SUb. Interleaving computation
  // dependent on SUa can prevent load combining due to register reuse.

The effect of this is that *data* dependencies from one load to a store
are copied as *artificial* dependencies from a different load to the
same store.

Then when BaseMemOpClusterMutation::apply looks at the stores, it finds
that some of them have a control dependency on a previous load, which
breaks the chains and means that the stores are not all considered part
of the same chain and won't all be clustered.

The fix is to only consider non-artificial control dependencies when
forming chains.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D71717

Files:
  llvm/lib/CodeGen/MachineScheduler.cpp
  llvm/test/CodeGen/AMDGPU/cluster_stores.ll


Index: llvm/test/CodeGen/AMDGPU/cluster_stores.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/cluster_stores.ll
@@ -0,0 +1,27 @@
+; RUN: llc -march=amdgcn -mcpu=gfx900 -debug-only=machine-scheduler -o /dev/null %s 2>&1 | FileCheck %s
+; REQUIRES: asserts
+
+define amdgpu_kernel void @f(i32* noalias %lb, i32* noalias %sb) {
+bb:
+; CHECK-COUNT-3: Cluster ld/st
+  %la0 = getelementptr inbounds i32, i32* %lb, i32 0
+  %ld0 = load i32, i32* %la0
+  %la1 = getelementptr inbounds i32, i32* %lb, i32 2
+  %ld1 = load i32, i32* %la1
+  %la2 = getelementptr inbounds i32, i32* %lb, i32 4
+  %ld2 = load i32, i32* %la2
+  %la3 = getelementptr inbounds i32, i32* %lb, i32 6
+  %ld3 = load i32, i32* %la3
+
+; CHECK-COUNT-3: Cluster ld/st
+  %sa0 = getelementptr inbounds i32, i32* %sb, i32 0
+  store i32 %ld0, i32* %sa0
+  %sa1 = getelementptr inbounds i32, i32* %sb, i32 2
+  store i32 %ld1, i32* %sa1
+  %sa2 = getelementptr inbounds i32, i32* %sb, i32 4
+  store i32 %ld2, i32* %sa2
+  %sa3 = getelementptr inbounds i32, i32* %sb, i32 6
+  store i32 %ld3, i32* %sa3
+
+  ret void
+}
Index: llvm/lib/CodeGen/MachineScheduler.cpp
===================================================================
--- llvm/lib/CodeGen/MachineScheduler.cpp
+++ llvm/lib/CodeGen/MachineScheduler.cpp
@@ -1609,7 +1609,7 @@
 
     unsigned ChainPredID = DAG->SUnits.size();
     for (const SDep &Pred : SU.Preds) {
-      if (Pred.isCtrl()) {
+      if (Pred.isCtrl() && !Pred.isArtificial()) {
         ChainPredID = Pred.getSUnit()->NodeNum;
         break;
       }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D71717.234743.patch
Type: text/x-patch
Size: 1612 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20191219/12122dce/attachment.bin>


More information about the llvm-commits mailing list