[PATCH] A bit-tracking DCE Pass

Mon Feb 16 10:21:35 PST 2015

In http://reviews.llvm.org/D7531#124045, @chandlerc wrote:

> Lots of nit-picky comments. Generally, this looks really nice. Have you profiled it? Have you tried to synthesize extreme test cases with very large integers to make sure this doesn't become *too* expensive?

I've not profiled it in detail; I've checked compile-time overhead on a few key test cases (like sqlite), and saw nothing significant.

> I'm generally fine just landing this and sorting out these issues after the fact. Please commit, and sorry for the delay reviewing it.

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:45
@@ +44,3 @@
+namespace {
+  struct BDCE : public FunctionPass {
+    static char ID; // Pass identification, replacement for typeid
----------------
chandlerc wrote:
> clang-format doesn't indent things in an anonymous namespace, and I prefer that.
Will do (corresponding change made to ADCE in r229412).

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:94
@@ +93,3 @@
+    KnownOne =  APInt(BitWidth, 0);
+    computeKnownBits(const_cast<Value*>(V1), KnownZero, KnownOne, DL, 0, AC,
+                     UserI, DT);
----------------
chandlerc wrote:
> Why accept const inputs just to cast it way?
To keep the casts localized to one place.

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:97-99
@@ +96,5 @@
+
+    if (V2) {
+      KnownZero2 = APInt(BitWidth, 0);
+      KnownOne2 =  APInt(BitWidth, 0);
+      computeKnownBits(const_cast<Value*>(V2), KnownZero2, KnownOne2, DL, 0, AC,
----------------
chandlerc wrote:
> This behavior doesn't really make sense. Maybe add a comment to the lambda to clarify what V1 and V2 are and why they would work this way?
Will do. The general idea is that, we don't want to repeat the known-bits calculation for both operands at each operand.

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:258-259
@@ +257,4 @@
+  AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
+  DataLayoutPass *DLP = getAnalysisIfAvailable<DataLayoutPass>();
+  DL = DLP ? &DLP->getDataLayout() : nullptr;
+  DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
----------------
chandlerc wrote:
> Just dig it out of the module, simpler than using the analysis layer.
Will do.

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:278-279
@@ +277,4 @@
+    if (IntegerType *IT = dyn_cast<IntegerType>(I.getType())) {
+      AliveBits[&I] = APInt(IT->getBitWidth(), 0);
+      Worklist.push_back(&I);
+    } else {
----------------
chandlerc wrote:
> continue after this?
Sure.

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:341
@@ +340,3 @@
+          if (ABNew != ABPrev || ABI == AliveBits.end()) {
+            AliveBits[I] = ABNew;
+            Worklist.push_back(I);
----------------
chandlerc wrote:
> std::move?
> 
> Probably a few other places you might want to move here so that large APInt values don't trigger too many copies of a heap allocated set of bits.
Good point.

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:365-367
@@ +364,5 @@
+          DEBUG(dbgs() << "BDCE: Trivializing: " << I << " (all bits dead)\n");
+          Value *Zero = ConstantInt::get(I.getType(), 0);
+          ++NumSimplified;
+          I.replaceAllUsesWith(Zero);
+          Changed = true;
----------------
chandlerc wrote:
> Why not replace with undef?
That's probably the right thing to do. However, given some of the discussion on poison/undef semantics, I'm not sure I want to do that. I've tried to stay away from cases here where the instructions themselves are trivial (like icmp sgt x, INT_MAX), but we *could* handle them here, and I'd like to test that change separately.

================
Comment at: lib/Transforms/Scalar/BDCE.cpp:380-382
@@ +379,5 @@
+
+    DEBUG(dbgs() << "BDCE: Removing: " << I << " (unused)\n");
+    Worklist.push_back(&I);
+    I.dropAllReferences();
+    Changed = true;
----------------
chandlerc wrote:
> Why not just erase the instructions? You can use the two-level iterators and pre-increment the iterator to avoid invalidation...
For the same reason that ADCE doesn't. The instructions themselves might not be trivially dead (there might be cyclic dependencies on other instructions that we'll visit here later, and the entire cycle will be removed). That's why this is done in two stages.

http://reviews.llvm.org/D7531

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/