[PATCH] D8705: ScheduleDAGInstrs::buildSchedGraph() handling of memory dependencies rewritten.

Andrew Trick via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 29 17:51:40 PST 2016

atrick added a comment.

Considering the nasty bugs we've had in tree for months (as a result of some attempts to fake AA without enabling it), and based on the discussion today, I can see that it's time to converge the AA and non-AA scheduling. The --enable-aa-sched-mi flag and ST.useAA hook can simply control whether we actually call the AA analysis. That way targets can still isolate themselves from potential bugs in MI-level AliasAnalysis, but the DAG builder logic will be common across targets and free of special cases.

As a result, non-AA targets may have to deal with bloated DAGs, but AA targets already face the same problem. We can handle that through bug reports, work together toward a common solution, and encourage more targets to adopt AA scheduling.

This may contradict my earlier advice, but now I can see that it's the only way to get this code in a maintainable state. I also realize that targets that care most about the scheduler's compile time (GPU) really need the same level of DAG support for alias analysis as targets that have been using AA already. We might as well have a common solution.

So my suggestion for this patch is:

--- a/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -588,11 +588,6 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA, const MachineFrameInfo *MFI,

  assert ((MIa->mayStore() || MIb->mayStore()) &&
          "Dependency checked between two loads");

-  // buildSchedGraph() will clear list of stores if not using AA,
-  // which means all stores have to be chained without AA.
-  if (!AA && MIb->mayStore())
-    return true;
-
   // Let the target decide if memory accesses cannot possibly overlap.
   if (TII->areMemAccessesTriviallyDisjoint(MIa, MIb, AA))
     return false;

@@ -1059,11 +1054,6 @@ void ScheduleDAGInstrs::buildSchedGraph(AliasAnalysis *AA,

  addChainDependencies(SU, Loads);
  addChainDependencies(SU, NonAliasLoads);

-  // If we're not using AA, clear Stores map since all stores
-  // will be chained.
-  if (!AAForDep)
-    Stores.clear();
-
   // Map this store to 'UnknownValue'.
   Stores.insert(SU, UnknownValue);
   continue;

@@ -1081,10 +1071,6 @@ void ScheduleDAGInstrs::buildSchedGraph(AliasAnalysis *AA,

  addChainDependencies(SU, stores_, V);
  addChainDependencies(SU, (ThisMayAlias ? Loads : NonAliasLoads), V);

-  // If we're not using AA, then we only need one store per object.
-  if (!AAForDep)
-    stores_.clearList(V);
-
   // Map this store to V.
   stores_.insert(SU, V);
 }

diff --git a/test/CodeGen/AArch64/arm64-misched-memdep-bug.ll b/test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
index a9ddb73..292fbb7 100644

--- a/test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
+++ b/test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
@@ -8,7 +8,7 @@
 ; CHECK: SU(2):   %vreg2<def> = LDRWui %vreg0, 1; mem:LD4[%ptr1_plus1] GPR32:%vreg2 GPR64common:%vreg0
 ; CHECK:   Successors:
 ; CHECK-NEXT:    val SU(5): Latency=4 Reg=%vreg2
-; CHECK-NEXT:    ch  SU(3): Latency=0
+; CHECK-NEXT:    ch  SU(4): Latency=0
 ; CHECK: SU(3):   STRWui %WZR, %vreg0, 0; mem:ST4[%ptr1] GPR64common:%vreg0
 ; CHECK:   Successors:
 ; CHECK: ch  SU(4): Latency=0
diff --git a/test/CodeGen/AArch64/tailcall_misched_graph.ll b/test/CodeGen/AArch64/tailcall_misched_graph.ll
index 343ffab..59a3be9 100644

--- a/test/CodeGen/AArch64/tailcall_misched_graph.ll
+++ b/test/CodeGen/AArch64/tailcall_misched_graph.ll
@@ -37,6 +37,8 @@ declare void @callee2(i8*, i8*, i8*, i8*, i8*,
 ; CHECK: SU({{.*}}):   [[VRB]]<def> = LDRXui <fi#-2>
 ; CHECK:  Successors:
-; CHECK:   ch  SU([[DEPSTORE:.*]]): Latency=0
+; CHECK:   ch  SU([[DEPSTOREB:.*]]): Latency=0
+; CHECK:   ch  SU([[DEPSTOREA:.*]]): Latency=0

-; CHECK: SU([[DEPSTORE]]):   STRXui %vreg0, <fi#-4>
+; CHECK: SU([[DEPSTOREA]]):   STRXui %vreg{{.*}}, <fi#-4>
+; CHECK: SU([[DEPSTOREB]]):   STRXui %vreg{{.*}}, <fi#-3>

