[PATCH] D58098: [IR] Add Use::moveToFrontOfUseList() method

Tue Feb 19 16:33:55 PST 2019

rtereshin marked 5 inline comments as done.
rtereshin added a comment.

In D58098#1403098 <https://reviews.llvm.org/D58098#1403098>, @dexonsmith wrote:

> This seems like it could be useful.  Do you have specific places in mind where you'd take advantage of it?

Hi Duncan,

thank you for looking into this!

Yes, I have two places in mind, one is pre-existing (https://reviews.llvm.org/D58100) and another is relatively new (https://reviews.llvm.org/D58101). It'd be great if we could find more, but I might need help with that: grepping takes me only so far. Do you have any ideas?

Thanks,
Roman

================
Comment at: unittests/IR/UseTest.cpp:114-118
+  const char *ModuleString = "define void @f(i32 %x) {\n"
+                             "entry:\n"
+                             "  %v0 = inttoptr i32 %x to i8*\n"
+                             "  ret void\n"
+                             "}\n";
----------------
dexonsmith wrote:
> This might be simpler to specify with a raw string literal.  You could also make this large enough to be interesting and specify the use-list, via `uselistorder` directives (see: http://llvm.org/docs/LangRef.html#use-list-order-directives).
> 
I'll try and play with that a bit more. I wasn't sure if what you mean is possible without technically breaking indentation and if it's okay or not. And I didn't know about use list order directives at all!

================
Comment at: unittests/IR/UseTest.cpp:131
+  Instruction *Ret = F->getEntryBlock().getTerminator();
+  constexpr unsigned Size = 20 * 1000;
+  for (unsigned I = 1; I <= Size; ++I) {
----------------
dexonsmith wrote:
> This seems really big.  What extra coverage are you getting from this?
> 
> Generally, there seems to be a lot of permutation here that is unrelated to the new API.  It'll take time to run.  What value is it adding?
For some time I had that idea that it could be useful if we had "compile time regression tests", which are also reliable and fast. I think it's possible when a speed up achieved is asymptotically significant, like going from O(n^2) algorithm to O(n) one. If the n is big enough and the constant factor is small enough, it may be possible to design a test which is very fast even when ran on very slow machines (bots or whatnot) if the compile time problem is still in check, but really, really slow even on fastest machines if there is a regression.

This test takes about 50 ms to run if `moveToFrontOfUseList` does what's expected, as well as the other operations (like adding a new use) behave like they currently do (inserting new uses in the beginning of the list, for instance). If any of it breaks apart, this test takes almost 2 minutes to run (before it fails with one of the google-tests assertions that check the values of the "performance counters", which in this case specifically are possible).

Please let me know what do you think!

================
Comment at: unittests/IR/UseTest.cpp:134
+    auto *Const = ConstantInt::get(X.getType(), I);
+    BinaryOperator::Create(Instruction::Add, &X, Const, "v" + Twine(I), Ret);
+  }
----------------
dexonsmith wrote:
> Do you really need to generate IR here?
I would like to do so, yes. How well code like below performs depends on what `moveToFrontOfUseList` and the rest of the methods do in collaboration. For instance, creating a new user for a value puts that user in the beginning of the use list.

I'd like to demonstrate here that even in that case the lookup cost doesn't blow up: new uses move the "preferred" use away from the beginning of the list, but only by K positions, where K - is the number of new users created. Also, such a configuration will only add K extra iterations once, as the "preferred" use will be moved to the beginning of the use list again. Therefore we can say that this extra cost K is amortized by the cost of creating (or even assigning by `User::setOperand` or `Use::set`) K new users: each new user takes at least O(1) to create and we can just say that one extra iteration over the user list is a part of that constant cost.

I don't wan't any of it to depend on how exactly the use lists are formed and sorted by IR parsing methods: I think the interoperability of the explicit C++ APIs are more important than that.

Also, this example very closely follows the real world use case I have in mind (https://reviews.llvm.org/D58101) and I like them being as close as possible. That use case creates new users of an instruction in a fashion similar to this test.

================
Comment at: unittests/IR/UseTest.cpp:152
+  constexpr unsigned NumRepetitions = 100;
+  for (unsigned J = 0; J < NumRepetitions; ++J)
+    for (Instruction &I : F->getEntryBlock())
----------------
dexonsmith wrote:
> Why do you have so many repetitions?
Explained (hopefully) in the other reply.

================
Comment at: unittests/IR/UseTest.cpp:163
+        }
+  ASSERT_EQ(Size * NumRepetitions, FoundCounter);
+  ASSERT_EQ(Size * NumRepetitions + Size, CheckCounter);
----------------
dexonsmith wrote:
> You've run a lot of tests, and are only checking the result at the end.  It will be hard to debug if this every goes wrong.  What's the key thing you're trying to test here?
I'm interested in the overall number of iterations required, not so much in their exact profile (in other words, in amortized cost, not the worst / best cases).

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58098/new/

https://reviews.llvm.org/D58098