[PATCH] D16795: WholeProgramDevirt: introduce.
Pete Cooper via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 5 19:21:30 PST 2016
pete accepted this revision.
pete added a comment.
This revision is now accepted and ready to land.
In http://reviews.llvm.org/D16795#343746, @pcc wrote:
> - Remove llvm:: in a few places
> - Parens
> - Better casting
> - Move code into functions; fix some non-determinism
> - Use DenseSet
> - Simplify a little
> - Add negative tests
Thanks for all of those. They look good.
Regarding this one:
> - We can't use RecursivelyDeleteTriviallyDeadInstructions
I hadn't realized you were caching the found calls, so makes sense you can't delete them with this. Thanks for the comment in the code saying so.
Also, thanks for the extensive tests.
I've now gone through the whole patch. I was able to follow along very well so kudos for the good comments and sensible naming on everything.
I don't think there's anything i've commented on here that needs further review. Please fix any comments I had which you feel are reasonable, ignore those which should be ignored, and LGTM :)
================
Comment at: lib/Transforms/IPO/PassManagerBuilder.cpp:640
@@ -639,1 +639,3 @@
+void PassManagerBuilder::addEarlyLTOOptimizationPasses(
+ legacy::PassManagerBase &PM) {
----------------
Thanks for the data. 3s isn't bad at all considering how big that Module is likely to be.
I'm going to take a look at what this does to LLVM itself when we link only a single backend, but my curiously there shouldn't block further progress on this.
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:113
@@ +112,3 @@
+ // Find a free (Size/8) byte region in each member of Used.
+ // FIXME: see if alignment helps.
+ for (unsigned I = 0;; ++I) {
----------------
Didn't realize that was the clang-format behavior. No problem keeping it the way it is then :)
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:208-210
@@ +207,5 @@
+ CS->replaceAllUsesWith(New);
+ if (auto II = dyn_cast<InvokeInst>(CS.getInstruction())) {
+ BranchInst::Create(II->getUnwindDest(), CS.getInstruction());
+ II->getUnwindDest()->removePredecessor(II->getParent());
+ }
----------------
I don't tend to work with InvokeInst much. I trust that what you're doing here is correct, but please add a test to cover it just to make sure it keeps being ok in future.
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:295
@@ +294,3 @@
+ // Take into account the GEP offset.
+ if (GEP->hasAllConstantIndices() && VPtr == GEP->getPointerOperand()) {
+ SmallVector<Value *, 8> Indices(GEP->op_begin() + 1, GEP->op_end());
----------------
Nitpicking: Please swap the order here as checking 'VPtr == GEP->getPointerOperand()' is faster than GEP->hasAllConstantIndices()
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:296-298
@@ +295,5 @@
+ if (GEP->hasAllConstantIndices() && VPtr == GEP->getPointerOperand()) {
+ SmallVector<Value *, 8> Indices(GEP->op_begin() + 1, GEP->op_end());
+ uint64_t GEPOffset = M->getDataLayout().getIndexedOffsetInType(
+ GEP->getSourceElementType(), Indices);
+ findLoadCallsAtConstantOffset(BitSet, User, Offset + GEPOffset, VTable);
----------------
This is pretty gross (not your fault! I looked and most other places do exactly the same thing)
But I did see one that was much nicer. Its in ConstantFolding.cpp. Can you make yours similar? I might go back and fix the others in tree.
Note, it was getting a range from a vector of indices, but I think it might be possible to do the equivalent with op_begin() + 1 and op_end().
makeArrayRef((Value * const *)Ops.data() + 1, Ops.size() - 1)));
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:586
@@ +585,3 @@
+
+ auto BitSet = cast<MetadataAsValue>(CI->getArgOperand(1))->getMetadata();
+
----------------
Can you move this in to the 'if (!Assumes.empty()) {' below as I think its only used in there.
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:615
@@ +614,3 @@
+ DenseMap<Metadata *, std::set<BitSetInfo>> BitSets;
+ NamedMDNode *BitSetNM = M->getNamedMetadata("llvm.bitsets");
+ std::vector<VTableBits> Bits;
----------------
Can we test for this being non-null as the very first thing in runModule? Or do you want to erase all the assume's as in the previous loop, even if you can't find the metadata?
If you do keep the conditional here, looks like you could move BitSets inside the scope, as well as moving the "// For each (bitset, offset) pair:" loop to the end of the conditional block
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:630-631
@@ +629,4 @@
+ auto OpGlobal = dyn_cast<GlobalVariable>(OpConst);
+ if (!OpGlobal)
+ continue;
+
----------------
In addition to these checks, it looks like BitSets is only used by tryFindVirtualCallTargets, and the first thing we do in there is check the following:
if (!BS.Bits->GV->isConstant())
return false;
auto Init = dyn_cast<ConstantArray>(BS.Bits->GV->getInitializer());
if (!Init)
return false;
Can we move those checks here so that we don't even populate BitSets[] entries unless these properties hold?
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:662-669
@@ +661,10 @@
+
+ tryVirtualConstProp(TargetsForSlot, S.second);
+ }
+
+ // Rebuild each global we touched as part of virtual constant propagation to
+ // include the before and after bytes.
+ for (VTableBits &B : Bits) {
+ if (B.Before.Bytes.empty() && B.After.Bytes.empty())
+ continue;
+
----------------
I think you could have a 'bool Changed' to track if tryVirtualConstProp ever did anything and only run this loop if thats the case. Is that right?
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:677-679
@@ +676,5 @@
+ // Before was stored in reverse order; flip it now.
+ for (unsigned I = 0; I != B.Before.Bytes.size() / 2; ++I)
+ std::swap(B.Before.Bytes[I],
+ B.Before.Bytes[B.Before.Bytes.size() - 1 - I]);
+
----------------
Please do something like 'I = 0, Size = B.Before.Bytes.Size()' so that you can avoid reevaluating the size on each iteration and inside the loop body
================
Comment at: lib/Transforms/IPO/WholeProgramDevirt.cpp:702
@@ +701,3 @@
+ Alias->setVisibility(B.GV->getVisibility());
+ Alias->takeName(B.GV);
+
----------------
No idea if this is possible, so feel free to ignore, but if the original VT had an explicit section on it, would either NewGV or Alias here need to inherit that section? I have no idea what the correct answer would be.
For safety sake, we could always exclude VTables with explicit sections from this optimization until we have a good answer.
Either way, please add a test with an explicit section and make sure we either propagate it as required, or don't run this optimization on it.
http://reviews.llvm.org/D16795
More information about the llvm-commits
mailing list