[PATCH] D59139: [CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst

Tue Mar 26 11:53:59 PDT 2019

tejohnson added a comment.

This change caused a compile time regression (that I referenced at the end of my summary in D59696 <https://reviews.llvm.org/D59696>). In this case, there are huge numbers of select instructions. After this change, since we now update the ModifiedDT correctly, the loop over the function in CodeGenPrepare::runOnFunction will break after each select is optimized, due to ModifiedDT being set. As mentioned in D59696 <https://reviews.llvm.org/D59696>, even after making the DT build lazy, there is still a large regression because of the number of times we re-walk the function.

I have a couple possible solutions, and am looking for guidance on what is preferable.

1. Reset the DT instead of existing the function walk. Now that D59696 <https://reviews.llvm.org/D59696> is in and the DT is built lazily, instead of setting ModifiedDT true and breaking out of the function walk in runOnFunction() (which was previously required due to the DT being built for each function walk), we can simply reset the DT unique_ptr, which will force a rebuild of DT when it is next needed:

diff --git a/lib/CodeGen/CodeGenPrepare.cpp b/lib/CodeGen/CodeGenPrepare.cpp
index 9d642ba245c..bd9cc5e4158 100644

- a/lib/CodeGen/CodeGenPrepare.cpp

+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -5962,7 +5962,7 @@ bool CodeGenPrepare::optimizeSelectInst(SelectInst *SI, bool &ModifiedDT) {

    !isFormingBranchFromSelectProfitable(TTI, TLI, SI))
  return false;
   

- ModifiedDT = true;

+  DT.reset();

  // Transform a sequence like this:
  //    start:

In my test case, since apparently there are many more select instructions (which don't use the DT to optimize) than the type of instructions that need the DT, this fixes the regression. It may not always have an effect, say if selects were interspersed with instructions that need the DT to optimize, but it won't make matters worse than the current status quo (which would cause the function iteration to exit and the DT to be reset there). It's possible that other places that set ModifiedDT could get the same treatment, I just tried in optimizeSelectInst since that was the place affected by this patch and that caused my regression.

2. Optimize select insts before iterative function walk. Similar to how some of the other instructions are optimized before this iterative walk (e.g. splitBranchCondition()), since select instructions don't need the DT, and presumably don't need iterative optimization, we could do the select optimizations earlier:

diff --git a/lib/CodeGen/CodeGenPrepare.cpp b/lib/CodeGen/CodeGenPrepare.cpp
index 9d642ba245c..b2a0c8a9570 100644

- a/lib/CodeGen/CodeGenPrepare.cpp

+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -465,6 +465,15 @@ bool CodeGenPrepare::runOnFunction(Function &F) {

  // to help generate sane code for PHIs involving such edges.
  EverMadeChange |= SplitIndirectBrCriticalEdges(F);
   

+  for (Function::iterator I = F.begin(); I != F.end(); ) {
+    BasicBlock *BB = &*I++;
+    CurInstIterator = BB->begin();
+    while (CurInstIterator != BB->end()) {
+      if (SelectInst *SI = dyn_cast<SelectInst>(&*CurInstIterator++))
+        EverMadeChange |= optimizeSelectInst(SI, ModifiedDT);
+    }
+  }
+

  bool MadeChange = true;
  while (MadeChange) {
    MadeChange = false;

Presumably this would be refactored into a separate function, similar to splitBranchCondition, the above was just a quick hack to test. The advantage of this second approach is that it will improve compile time in any other cases involving select instructions interspersed with instructions that need the DT to optimize.

However - this only eliminates 2/3 of the additional CGP overhead. Since I didn't remove the call to optimizeSelectInst from optimizeInst in this case, we may be finding additional selects to optimize during the iterative function optimization stage.

3. Similar to 2 but do the select optimization once per function walk:

diff --git a/lib/CodeGen/CodeGenPrepare.cpp b/lib/CodeGen/CodeGenPrepare.cpp
index 9d642ba245c..2241d4af206 100644

- a/lib/CodeGen/CodeGenPrepare.cpp

+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -468,6 +468,14 @@ bool CodeGenPrepare::runOnFunction(Function &F) {

  bool MadeChange = true;
  while (MadeChange) {
    MadeChange = false;

+    for (Function::iterator I = F.begin(); I != F.end(); ) {
+      BasicBlock *BB = &*I++;
+      CurInstIterator = BB->begin();
+      while (CurInstIterator != BB->end()) {
+        if (SelectInst *SI = dyn_cast<SelectInst>(&*CurInstIterator++))
+          MadeChange |= optimizeSelectInst(SI, ModifiedDT);
+      }
+    }

  DT.reset();
  for (Function::iterator I = F.begin(); I != F.end(); ) {
    BasicBlock *BB = &*I++;

This removes the remaining overhead.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59139/new/

https://reviews.llvm.org/D59139