[PATCH] D65410: [PassManager] First Pass implementation at -O1 pass pipeline

Mon Aug 5 18:03:44 PDT 2019

chandlerc added a comment.

One high level point that is at least worth clarifying, and maybe others will want to suggest a different approach:

The overall approach here is to have as small of a difference between the O1 <https://reviews.llvm.org/owners/package/1/> and O2 <https://reviews.llvm.org/owners/package/2/> pipelines as possible.

An alternative approach that we could take would be to design a focused O1 <https://reviews.llvm.org/owners/package/1/> pipeline without regard to how much it diverges from the O2 <https://reviews.llvm.org/owners/package/2/> pipeline.

Which approach is used somewhat depends on the goals. I feel like the goal here is to get as close to the level of optimization at O2 <https://reviews.llvm.org/owners/package/2/> as possible without losing compile time or coherent backtraces for test / assertion failures. For that goal, the approach taken makes sense. But it seems important to clarify that goal as otherwise I think we'd want to go in very different directions.

In D65410#1613555 <https://reviews.llvm.org/D65410#1613555>, @hfinkel wrote:

> Thanks for starting on this. Can you go ahead and replace the sroa calls with mem2reg calls for `O1` and then see what that does to the performance? That strikes me as a major change, but certainly one that potentially makes sense, so I'd rather we go ahead and test it now before we make decisions about other adjustments.

I really think we need mem2reg at least at -O1... In fact, I really think we need SROA at O1 <https://reviews.llvm.org/owners/package/1/>. If it is actually a compile time problem, I'd like to fix that in SROA. I don't really expect it to be though.

> FWIW, I thought that we might run InstCombine less often (or maybe replace it with InstSimplify, in some places). Did you try that?

I think the biggest thing to do would be to avoid repeated runs of instcombine over the same code. I suspect we want at least one run after inliner and inside the CGSCC walk for canonicalization. But it'd be great to limit it to exactly one or maaaaybe one before and one after the loop pipeline.

================
Comment at: llvm/lib/Passes/PassBuilder.cpp:407-414
   }

   // Speculative execution if the target has divergent branches; otherwise nop.
-  FPM.addPass(SpeculativeExecutionPass());
+  if (Level > O1)
+    FPM.addPass(SpeculativeExecutionPass());

   // Optimize based on known information about branches, and cleanup afterward.
----------------
I think you can merge all of these?

================
Comment at: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp:362

+  // TODO: Investigate the cost/benefit of tail call elimination on debugging.
   MPM.add(createTailCallEliminationPass()); // Eliminate tail calls
----------------
hfinkel wrote:
> By definition, this loses information from the call stack, no?
Yeah, I'd have really expected this to be skipped.

================
Comment at: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp:432

+  // TODO: Investigate if this is too expensive at O1.
   MPM.add(createAggressiveDCEPass());         // Delete dead instructions
----------------
hfinkel wrote:
> Yes, I'd fall back to using regular DCE.
+1

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65410/new/

https://reviews.llvm.org/D65410