[llvm-dev] RFC: speedups with instruction side-data (ADCE, perhaps others?)

escha via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 14 10:29:41 PDT 2015


> On Sep 14, 2015, at 9:37 AM, escha via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> I’ve been playing around with optimizing performance various passes and noticed something about ADCE: it keeps an Alive set that requires a lot of upkeep time-wise, but if each instruction had a /single bit/ of side data (to represent liveness, solely within the pass), the set wouldn’t be needed. I tested this out and observed a ~1/3 reduction in time in ADCE: 1454ms to 982ms according to a profile over our test suite (for a total of 0.6% compile time savings in our pipeline).
> 
> Are there any other passes that could benefit from having a single bit (or similarly small amount) of per-Instruction metadata local to the pass, i.e. to avoid having to keep a side-Set of instructions that’s only used to test membership of the set (i.e. not iterated over)? Is this sort of thing something reasonable (maybe it could be stuffed in the SubclassData short and exposed via a public API?)

Here’s a simple prototype of the kind of thing I mean:

diff --git a/include/llvm/IR/Value.h b/include/llvm/IR/Value.h
index 9d83cef..d48f8f3 100644
--- a/include/llvm/IR/Value.h
+++ b/include/llvm/IR/Value.h
@@ -104,13 +104,17 @@ protected:
   ///
   /// Note, this should *NOT* be used directly by any class other than User.
   /// User uses this value to find the Use list.
-  enum : unsigned { NumUserOperandsBits = 29 };
+  enum : unsigned { NumUserOperandsBits = 28 };
   unsigned NumUserOperands : NumUserOperandsBits;
 
   bool IsUsedByMD : 1;
   bool HasName : 1;
   bool HasHungOffUses : 1;
 
+  /// \brief A 1-bit marker that can be used freely within passes, e.g. for
+  /// mark-sweep algorithms like ADCE, to avoid keeping a side data set.
+  bool Marker : 1;
+
 private:
   template <typename UseT> // UseT == 'Use' or 'const Use'
   class use_iterator_impl
@@ -388,6 +392,12 @@ public:
   /// \brief Return true if there is a value handle associated with this value.
   bool hasValueHandle() const { return HasValueHandle; }
 
+  /// \brief Return the current value of the Marker.
+  bool getMarker() const { return Marker; }
+
+  /// \brief Set the current value of the Marker.
+  void setMarker(bool Val) { Marker = Val; }
+
   /// \brief Return true if there is metadata referencing this value.
   bool isUsedByMetadata() const { return IsUsedByMD; }
 
diff --git a/lib/Transforms/Scalar/ADCE.cpp b/lib/Transforms/Scalar/ADCE.cpp
index 96a0107..5578860 100644
--- a/lib/Transforms/Scalar/ADCE.cpp
+++ b/lib/Transforms/Scalar/ADCE.cpp
@@ -53,15 +53,17 @@ bool ADCE::runOnFunction(Function& F) {
   if (skipOptnoneFunction(F))
     return false;
 
-  SmallPtrSet<Instruction*, 128> Alive;
+ // SmallPtrSet<Instruction*, 128> Alive;
   SmallVector<Instruction*, 128> Worklist;
 
   // Collect the set of "root" instructions that are known live.
   for (Instruction &I : instructions(F)) {
     if (isa<TerminatorInst>(I) || isa<DbgInfoIntrinsic>(I) || I.isEHPad() ||
         I.mayHaveSideEffects()) {
-      Alive.insert(&I);
+      I.setMarker(true);
       Worklist.push_back(&I);
+    } else {
+      I.setMarker(false);
     }
   }
 
@@ -70,8 +72,10 @@ bool ADCE::runOnFunction(Function& F) {
     Instruction *Curr = Worklist.pop_back_val();
     for (Use &OI : Curr->operands()) {
       if (Instruction *Inst = dyn_cast<Instruction>(OI))
-        if (Alive.insert(Inst).second)
+        if (!Inst->getMarker()) {
           Worklist.push_back(Inst);
+          Inst->setMarker(true);
+        }
     }
   }
 
@@ -80,7 +84,7 @@ bool ADCE::runOnFunction(Function& F) {
   // value of the function, and may therefore be deleted safely.
   // NOTE: We reuse the Worklist vector here for memory efficiency.
   for (Instruction &I : instructions(F)) {
-    if (!Alive.count(&I)) {
+    if (!I.getMarker()) {
       Worklist.push_back(&I);
       I.dropAllReferences();
     }



More information about the llvm-dev mailing list