[PATCH] D87063: [BitcodeReader] Fix O(N^2) in placeholder replacement algorithm.

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 22 13:59:17 PDT 2020


dblaikie accepted this revision.
dblaikie added a comment.
This revision is now accepted and ready to land.

Think I've got it - sorry for the delay & hopefully my rambling notes (if you want to have a read) at least provide a little check on the work - but I hardly feel the most qualified in this area.



================
Comment at: llvm/lib/Bitcode/Reader/ValueList.cpp:152-153
 void BitcodeReaderValueList::resolveConstantForwardRefs() {
-  // Sort the values by-pointer so that they are efficient to look up with a
-  // binary search.
-  llvm::sort(ResolveConstants);
-
+  DenseMap<Constant *, Constant *> ConstantMapping;
+  ConstantMapping.reserve(ResolveConstants.size());
+
----------------
I'd consider moving this down closer to the scope it's used, perhaps?


================
Comment at: llvm/lib/Bitcode/Reader/ValueList.cpp:159-172
+  auto CountOperandsOfUsers = [&](Constant *C) {
+    for (Use &ConstantUse : C->uses()) {
+      auto *ConstantUser = dyn_cast<Constant>(ConstantUse.getUser());
+      if (ConstantUser && !isa<GlobalValue>(ConstantUser)) {
+        unsigned NumOperandsFound = ++NumRewrittenOperands[ConstantUser];
+        if (NumOperandsFound == 1)
+          RewriteWorklist.push_back(ConstantUser);
----------------
So if I understand this correctly - we might have, say 3 forward references ("ResolveConstants" - would that variable be more aptly named "UnresolvedConstants"? (just clarifying my own understanding, not a request/need to change if it doesn't seem generally worthwhile)) - but there might be constants without any forward references but they do have references to constants that have forward references - and so on. So, say ResolveConstants is {A, B, C} but some already resolved constants are {X, Y, Z}. And the operands are Y(X(A(C),B)) and Z references nothing (or at least no other constants (that aren't also GlobalValues, I guess)).

Placeholders (in ResolveConstant) don't actually use anything, I guess? (because they're placeholders/haven't been filled out yet)

So the 169 for loop goes through {A, B, C}:
Constant: A -> NumRewrittenOperands = {X:1}, RewriteWorklist = {X}
Constant: B -> NumRewrittenOperands = {X:2}, RewriteWorklist = {X}
Constant: C -> NumRewrittenOperands = {X:2}, RewriteWorklist = {X}

Then the loop on 171:
Constant: X -> NumRewrittenOperands = {X:2, Y:1}, RewriteWorklist = {Y}
Constant: Y -> NumRewrittenOperands = {X:2, Y:1}, RewriteWorklist = {}

Looking further down, I see that we might find resolved constants during this walk. So, for instance when visiting the uses of C we might find A', the resolved version of A? I would've thought if that was the case then we'd end up double-counting X's NumRewrittenOperands, for instance - but I guess X is only using either A or A', not both. So A' gets built, but it isn't inserted yet... which is the whole point of all this, right right. So if we did come across A' like that, we'd put A' in NumRewrittenOperands:1, A' would go in the RewriteWorklist, but have no effect since no one is using it - they're all using A.

So let's say we endu p with NumRewrittenOperands = {X:2, Y:1, A':1}


================
Comment at: llvm/lib/Bitcode/Reader/ValueList.cpp:196-206
+  for (auto ResolveConstant : ResolveConstants) {
+    Constant *MappedC = cast<Constant>(operator[](ResolveConstant.second));
+    if (NumRewrittenOperands.lookup(MappedC)) {
+      // We have a placeholder that resolves to a constant that needs to
+      // be rewritten; note it so we can process it later.
+      ConstToPlaceholders[MappedC].push_back(ResolveConstant.first);
+    } else {
----------------
So here we visit {A, B, C} again. Find their mapped equivalents, and build an ordered RewriteWorklist... (RewriteWorklist is the same variable, but is being used independently here, since it was cleared out by the loop at 171 above)

A, A' -> ConstToPlaceholders = {A':[A]}, NumRewrittenOperands = {X:2, Y:1, A':1}, RewriteWorklist = {}, ConstantMapping = {}
B, B' -> ConstToPlaceholders = {A':[A]}, NumRewrittenOperands = {X:1, Y:1, A':1}, RewriteWorklist = {}, ConstantMapping = {B:B'}, RewrittenConstants = {B}
C, C' -> ConstToPlaceholders = {A':[A]}, NumRewrittenOperands = {X:1, Y:1}, RewriteWorklist = {A'}, ConstantMapping = {B:B', C:C'}, RewrittenConstants = {B, C}



================
Comment at: llvm/lib/Bitcode/Reader/ValueList.cpp:210-218
+  while (!RewriteWorklist.empty()) {
+    Constant *C = RewriteWorklist.pop_back_val();
 
-      // Make the new constant.
-      Constant *NewC;
-      if (ConstantArray *UserCA = dyn_cast<ConstantArray>(UserC)) {
-        NewC = ConstantArray::get(UserCA->getType(), NewOps);
-      } else if (ConstantStruct *UserCS = dyn_cast<ConstantStruct>(UserC)) {
-        NewC = ConstantStruct::get(UserCS->getType(), NewOps);
-      } else if (isa<ConstantVector>(UserC)) {
-        NewC = ConstantVector::get(NewOps);
-      } else {
-        assert(isa<ConstantExpr>(UserC) && "Must be a ConstantExpr.");
-        NewC = cast<ConstantExpr>(UserC)->getWithOperands(NewOps);
-      }
+    // Add this constant, and placeholders that refer to it, to the sorted
+    // order. Add its uses to the worklist.
+    AddConstantToOrder(C);
+    for (Constant *Placeholder : ConstToPlaceholders.lookup(C))
----------------
Constant: A', RewriteWorklist = {}
  AddConstantToOrder(A') -> NumRewrittenOperands = {X:1, Y:1}, RewrittenConstants = {B, C, A'}
    AddConstantToOrder(A) -> NumRewrittenOperands = {Y:1}, RewrittenConstants = {B, C, A', A}, RewriteWorklist = {X}
Constant: X, RewriteWorklist = {}
  AddConstantToOrder(X) -> NumRewrittenOperands = {Y:1}, RewrittenConstants = {B, C, A', A, X}, RewriteWorklist = {Y}
Constant: Y, RewriteWorklist = {}
  AddConstantToOrder(Y) -> NumRewrittenOperands = {}, RewrittenConstants = {B, C, A', A, X, Y}, RewriteWorklist = {}


================
Comment at: llvm/lib/Bitcode/Reader/ValueList.cpp:222-254
+  for (Constant *C : RewrittenConstants) {
+    if (isa<ConstantPlaceHolder>(C)) {
+      assert(ConstantMapping.lookup(C) && "Should already be mapped");
+      continue;
+    }
 
+    // Remap the constant's operands.
----------------
RewrittenConstants = {B, C, A', A, X, Y}, ConstantMapping = {B:B', C:C'},

Loop:
Constant: B -> Placeholder skip
Constant: C -> Placeholder skip
Constant: A' -> ConstantMapping = {B:B', C:C', A':A'', A:A''}
Constant: A -> Placeholder skip
Constant: X -> ConstantMapping = {B:B', C:C', A':A'', X, X'} (mapping A and B to make X'(A'', B'))
Constant: Y -> ConstantMapping = {B:B', C:C', A':A'', X, X', Y:Y'} (mapping X to make Y'(X'(A'', B')))



================
Comment at: llvm/lib/Bitcode/Reader/ValueList.cpp:256-257
+
+  // Rewrite the uses of each constant.  The topological sort ensures we never
+  // RAUW a constant that's used by other constants.
+  for (Constant *C : reverse(RewrittenConstants)) {
----------------
Ah, this is because if we RAUW, say, C first - it'd replace the use of C in A', but we're going to throw away A' (line 266) anyway in favor of A''?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87063/new/

https://reviews.llvm.org/D87063



More information about the llvm-commits mailing list