[llvm] r331381 - [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions.

Daniel Sanders via llvm-commits llvm-commits at lists.llvm.org
Wed May 2 10:59:16 PDT 2018


Author: dsanders
Date: Wed May  2 10:59:16 2018
New Revision: 331381

URL: http://llvm.org/viewvc/llvm-project?rev=331381&view=rev
Log:
[reassociate] Fix excessive revisits when processing long chains of reassociatable instructions.

Summary:
Some of our internal testing detected a major compile time regression which I've
tracked down to:
    r278938 - Revert "Reassociate: Reprocess RedoInsts after each inst".
It appears that processing long chains of reassociatable instructions causes
non-linear (potentially exponential) growth in the number of times an
instruction is revisited. For example, the included test revisits instructions
220 times in a 20-instruction test.

It appears that r278938 reversed the order instructions were visited and that
this is preventing scheduled revisits from being cancelled as a result of
visiting the instructions naturally during normal processing. However, simply
reversing the order also harmed the generated code. Upon closer inspection, it
was discovered that revisits occurred in the opposite order to the first pass
(Thanks to escha for spotting that).

This patch makes the revisit order consistent with the first pass which allows
more revisits to be cancelled. This does appear to have a small impact on the
generated code in few cases but it significantly reduces compile-time.

After this patch, our internal test that was most affected by the regression
dropped from ~2 million revisits to ~4k resulting in Reassociate having 0.46%
of the runtime it had before (99.54% improvement).

Here's the summaries reported by lnt for the LLVM test-suite with --benchmarking-only:
| metric         | geomean before patch | geomean after patch | delta   |
| -----          | -----                | -----               | -----   |
| compile time   | 0.1956               | 0.1261              | -35.54% |
| execution time | 0.3240               | 0.3237              | -       |
| code size      | 7365.4459            | 7365.6079           | -       |

The results have a few wins and losses on compile-time, mostly in the +/- 2.5% range. There was one outlier though:
| Performance Regressions - compile_time | Δ | Previous | Current |
| MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk | 9.82% | 2.0473 | 2.2483 |

Reviewers: javed.absar, dberlin

Reviewed By: dberlin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45734

Added:
    llvm/trunk/test/Transforms/Reassociate/long-chains.ll
Modified:
    llvm/trunk/include/llvm/Transforms/Scalar/Reassociate.h
    llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp

Modified: llvm/trunk/include/llvm/Transforms/Scalar/Reassociate.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Scalar/Reassociate.h?rev=331381&r1=331380&r2=331381&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Transforms/Scalar/Reassociate.h (original)
+++ llvm/trunk/include/llvm/Transforms/Scalar/Reassociate.h Wed May  2 10:59:16 2018
@@ -29,6 +29,7 @@
 #include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/ValueHandle.h"
+#include <deque>
 
 namespace llvm {
 
@@ -69,9 +70,14 @@ class XorOpnd;
 
 /// Reassociate commutative expressions.
 class ReassociatePass : public PassInfoMixin<ReassociatePass> {
+public:
+  using OrderedSet =
+      SetVector<AssertingVH<Instruction>, std::deque<AssertingVH<Instruction>>>;
+
+protected:
   DenseMap<BasicBlock *, unsigned> RankMap;
   DenseMap<AssertingVH<Value>, unsigned> ValueRankMap;
-  SetVector<AssertingVH<Instruction>> RedoInsts;
+  OrderedSet RedoInsts;
 
   // Arbitrary, but prevents quadratic behavior.
   static const unsigned GlobalReassociateLimit = 10;
@@ -108,8 +114,7 @@ private:
                      SmallVectorImpl<reassociate::ValueEntry> &Ops);
   Value *RemoveFactorFromExpression(Value *V, Value *Factor);
   void EraseInst(Instruction *I);
-  void RecursivelyEraseDeadInsts(Instruction *I,
-                                 SetVector<AssertingVH<Instruction>> &Insts);
+  void RecursivelyEraseDeadInsts(Instruction *I, OrderedSet &Insts);
   void OptimizeInst(Instruction *I);
   Instruction *canonicalizeNegConstExpr(Instruction *I);
   void BuildPairMap(ReversePostOrderTraversal<Function *> &RPOT);

Modified: llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp?rev=331381&r1=331380&r2=331381&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp Wed May  2 10:59:16 2018
@@ -810,7 +810,7 @@ void ReassociatePass::RewriteExprTree(Bi
 /// pushing the negates through adds.  These will be revisited to see if
 /// additional opportunities have been exposed.
 static Value *NegateValue(Value *V, Instruction *BI,
-                          SetVector<AssertingVH<Instruction>> &ToRedo) {
+                          ReassociatePass::OrderedSet &ToRedo) {
   if (auto *C = dyn_cast<Constant>(V))
     return C->getType()->isFPOrFPVectorTy() ? ConstantExpr::getFNeg(C) :
                                               ConstantExpr::getNeg(C);
@@ -924,8 +924,8 @@ static bool ShouldBreakUpSubtract(Instru
 
 /// If we have (X-Y), and if either X is an add, or if this is only used by an
 /// add, transform this into (X+(0-Y)) to promote better reassociation.
-static BinaryOperator *
-BreakUpSubtract(Instruction *Sub, SetVector<AssertingVH<Instruction>> &ToRedo) {
+static BinaryOperator *BreakUpSubtract(Instruction *Sub,
+                                       ReassociatePass::OrderedSet &ToRedo) {
   // Convert a subtract into an add and a neg instruction. This allows sub
   // instructions to be commuted with other add instructions.
   //
@@ -1871,8 +1871,8 @@ Value *ReassociatePass::OptimizeExpressi
 
 // Remove dead instructions and if any operands are trivially dead add them to
 // Insts so they will be removed as well.
-void ReassociatePass::RecursivelyEraseDeadInsts(
-    Instruction *I, SetVector<AssertingVH<Instruction>> &Insts) {
+void ReassociatePass::RecursivelyEraseDeadInsts(Instruction *I,
+                                                OrderedSet &Insts) {
   assert(isInstructionTriviallyDead(I) && "Trivially dead instructions only!");
   SmallVector<Value *, 4> Ops(I->op_begin(), I->op_end());
   ValueRankMap.erase(I);
@@ -2333,7 +2333,7 @@ PreservedAnalyses ReassociatePass::run(F
 
     // Make a copy of all the instructions to be redone so we can remove dead
     // instructions.
-    SetVector<AssertingVH<Instruction>> ToRedo(RedoInsts);
+    OrderedSet ToRedo(RedoInsts);
     // Iterate over all instructions to be reevaluated and remove trivially dead
     // instructions. If any operand of the trivially dead instruction becomes
     // dead mark it for deletion as well. Continue this process until all
@@ -2349,7 +2349,8 @@ PreservedAnalyses ReassociatePass::run(F
     // Now that we have removed dead instructions, we can reoptimize the
     // remaining instructions.
     while (!RedoInsts.empty()) {
-      Instruction *I = RedoInsts.pop_back_val();
+      Instruction *I = RedoInsts.front();
+      RedoInsts.erase(RedoInsts.begin());
       if (isInstructionTriviallyDead(I))
         EraseInst(I);
       else

Added: llvm/trunk/test/Transforms/Reassociate/long-chains.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Reassociate/long-chains.ll?rev=331381&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/Reassociate/long-chains.ll (added)
+++ llvm/trunk/test/Transforms/Reassociate/long-chains.ll Wed May  2 10:59:16 2018
@@ -0,0 +1,37 @@
+; RUN: opt < %s -reassociate -stats -S 2>&1 | FileCheck %s
+; REQUIRES: asserts
+
+define i8 @longchain(i8 %in1, i8 %in2, i8 %in3, i8 %in4, i8 %in5, i8 %in6, i8 %in7, i8 %in8, i8 %in9, i8 %in10, i8 %in11, i8 %in12, i8 %in13, i8 %in14, i8 %in15, i8 %in16, i8 %in17, i8 %in18, i8 %in19, i8 %in20) {
+  %tmp1 = add i8 %in1, %in2
+  %tmp2 = add i8 %tmp1, %in3
+  %tmp3 = add i8 %tmp2, %in4
+  %tmp4 = add i8 %tmp3, %in3
+  %tmp5 = add i8 %tmp4, %in4
+  %tmp6 = add i8 %tmp5, %in5
+  %tmp7 = add i8 %tmp6, %in6
+  %tmp8 = add i8 %tmp7, %in7
+  %tmp9 = add i8 %tmp8, %in8
+  %tmp10 = add i8 %tmp9, %in9
+  %tmp11 = add i8 %tmp10, %in10
+  %tmp12 = add i8 %tmp11, %in11
+  %tmp13 = add i8 %tmp12, %in12
+  %tmp14 = add i8 %tmp13, %in13
+  %tmp15 = add i8 %tmp14, %in14
+  %tmp16 = add i8 %tmp15, %in15
+  %tmp17 = add i8 %tmp16, %in16
+  %tmp18 = add i8 %tmp17, %in17
+  %tmp19 = add i8 %tmp18, %in18
+  %tmp20 = add i8 %tmp19, %in19
+  %tmp21 = add i8 %tmp20, %in20
+  ret i8 %tmp20
+}
+
+; Check the number of instructions reassociated is in the tens not the hundreds.
+; At the time of writing, the exact numbers were:
+; Bad order: 220 reassociate - Number of insts reassociated
+; Good order: 55 reassociate - Number of insts reassociated
+;
+; CHECK: {{^[1-9][0-9]}} reassociate - Number of insts reassociated
+
+; Additionally check that we made at least three changes.
+; CHECK:      {{^ *[3-9]}} reassociate - Number of multiplies factored




More information about the llvm-commits mailing list