[llvm] r294463 - LVI: Add a per-value worklist limit to LazyValueInfo.
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 10 09:53:47 PST 2017
ping
On 02/08/2017 09:29 PM, Philip Reames via llvm-commits wrote:
> Danny,
>
> I feel like there's something missing here. In particular, I'm not
> sure I agree with your description of the problem. See inline
> comments below. Can you help me understand why this is needed?
>
> On 02/08/2017 07:22 AM, Daniel Berlin via llvm-commits wrote:
>> Author: dannyb
>> Date: Wed Feb 8 09:22:52 2017
>> New Revision: 294463
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=294463&view=rev
>> Log:
>> LVI: Add a per-value worklist limit to LazyValueInfo.
>>
>> Summary:
>> LVI is now depth first, which is optimal for iteration strategy in
>> terms of work per call. However, the way the results get cached means
>> it can still go very badly N^2 or worse right now. The overdefined
>> cache is per-block, because LVI wants to try to get different results
>> for the same name in different blocks (IE solve the problem
>> PredicateInfo solves). This means even if we discover a value is
>> overdefined after going very deep, it doesn't cache this information,
>> causing it to end up trying to rediscover it again and again. The
>> same is true for values along the way.
> This doesn't parse for me. We have the OverdefinedCache and do cache
> the result of a (Value, BB) pair being overdefined. Other than the
> fact the cache is split - which admitted is confusing - where's the
> problem here?
>> In practice, overdefined
>> anywhere should mean overdefined everywhere (this is how, for example,
>> SCCP works).
>>
>> Until we get around to reworking the overdefined cache, we need to
>> limit the worklist size we process. Note that permanently reverting
>> the DFS strategy exploration seems the wrong strategy (temporarily
>> seems fine if we really want). BFS is clearly the wrong approach, it
>> just gets luckier on some testcases. It's also very hard to design
>> an effective throttle for BFS. For DFS, the throttle is directly related
>> to the depth of the CFG. So really deep CFGs will get cutoff, smaller
>> ones will not. As the CFG simplifies, you get better results.
>> In BFS, the limit is it's related to the fan-out times average block
>> size,
>> which is harder to reason about or make good choices for.
> I agree that a threshold for DFS makes much more sense than for BFS.
> I think having the threshold is a good idea, though I'd still like to
> understand the test case. I suspect we should be able to use a much
> higher threshold value.
>>
>> Bug being filed about the overdefined cache, but it will require major
>> surgery to fix it (plumbing predicateinfo through CVP or LVI).
>>
>> Note: I did not make this number configurable because i'm not sure
>> anyone really needs to tweak this knob. We run CVP 3 times. On the
>> testcases i have the slow ones happen in the middle, where CVP is
>> doing cleanup work other things are effective at. Over the course of
>> 3 runs, we don't see to have any real loss of performance.
> I have a very different pass pipeline than you do. Please don't
> forget that when tuning. Over-fitting the current O2 pipeline is a
> path we don't want to pursue.
>
> Did you explore other values for the cutoff? In particular, if we
> used 10,000 would your test case scale badly?
>>
>> I haven't gotten a minimized testcase yet, but just imagine in your
>> head a testcase where, going *up* the CFG, you have branches, one of
>> which leads 50000 blocks deep, and the other, to something where the
>> answer is overdefined immediately. BFS would discover the overdefined
>> faster than DFS, but do more work to do so. In practice, the right
>> answer is "once DFS discovers overdefined for a value, stop trying to
>> get more info about that value" (and so, DFS would normally cache the
>> overdefined results for every value it passed through in those 50k
>> blocks, and never do that work again. But it don't, because of the
>> naming problem)
>>
>> Reviewers: chandlerc, djasper
>>
>> Subscribers: llvm-commits
>>
>> Differential Revision: https://reviews.llvm.org/D29715
>>
>> Modified:
>> llvm/trunk/lib/Analysis/LazyValueInfo.cpp
>>
>> Modified: llvm/trunk/lib/Analysis/LazyValueInfo.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/LazyValueInfo.cpp?rev=294463&r1=294462&r2=294463&view=diff
>> ==============================================================================
>>
>> --- llvm/trunk/lib/Analysis/LazyValueInfo.cpp (original)
>> +++ llvm/trunk/lib/Analysis/LazyValueInfo.cpp Wed Feb 8 09:22:52 2017
>> @@ -39,6 +39,10 @@ using namespace PatternMatch;
>> #define DEBUG_TYPE "lazy-value-info"
>> +// This is the number of worklist items we will process to try to
>> discover an
>> +// answer for a given value.
>> +static const unsigned MaxProcessedPerValue = 500;
>> +
>> char LazyValueInfoWrapperPass::ID = 0;
>> INITIALIZE_PASS_BEGIN(LazyValueInfoWrapperPass, "lazy-value-info",
>> "Lazy Value Information Analysis", false, true)
>> @@ -563,7 +567,7 @@ namespace {
>> /// This stack holds the state of the value solver during a query.
>> /// It basically emulates the callstack of the naive
>> /// recursive value lookup process.
>> - std::stack<std::pair<BasicBlock*, Value*> > BlockValueStack;
>> + SmallVector<std::pair<BasicBlock*, Value*>, 8> BlockValueStack;
>> /// Keeps track of which block-value pairs are in
>> BlockValueStack.
>> DenseSet<std::pair<BasicBlock*, Value*> > BlockValueSet;
>> @@ -576,7 +580,7 @@ namespace {
>> DEBUG(dbgs() << "PUSH: " << *BV.second << " in " <<
>> BV.first->getName()
>> << "\n");
>> - BlockValueStack.push(BV);
>> + BlockValueStack.push_back(BV);
>> return true;
>> }
>> @@ -646,24 +650,50 @@ namespace {
>> } // end anonymous namespace
>> void LazyValueInfoImpl::solve() {
>> + SmallVector<std::pair<BasicBlock *, Value *>, 8> StartingStack(
>> + BlockValueStack.begin(), BlockValueStack.end());
>> +
>> + unsigned processedCount = 0;
>> while (!BlockValueStack.empty()) {
>> - std::pair<BasicBlock*, Value*> &e = BlockValueStack.top();
>> + processedCount++;
>> + // Abort if we have to process too many values to get a result
>> for this one.
>> + // Because of the design of the overdefined cache currently
>> being per-block
>> + // to avoid naming-related issues (IE it wants to try to give
>> different
>> + // results for the same name in different blocks), overdefined
>> results don't
>> + // get cached globally, which in turn means we will often try to
>> rediscover
>> + // the same overdefined result again and again. Once something
>> like
>> + // PredicateInfo is used in LVI or CVP, we should be able to
>> make the
>> + // overdefined cache global, and remove this throttle.
>> + if (processedCount > MaxProcessedPerValue) {
>> + DEBUG(dbgs() << "Giving up on stack because we are getting too
>> deep\n");
>> + // Fill in the original values
>> + while (!StartingStack.empty()) {
>> + std::pair<BasicBlock *, Value *> &e = StartingStack.back();
>> + TheCache.insertResult(e.second, e.first,
>> + LVILatticeVal::getOverdefined());
>> + StartingStack.pop_back();
>> + }
>> + BlockValueSet.clear();
>> + BlockValueStack.clear();
>> + return;
>> + }
>> + std::pair<BasicBlock *, Value *> &e = BlockValueStack.back();
>> assert(BlockValueSet.count(e) && "Stack value should be in
>> BlockValueSet!");
>> if (solveBlockValue(e.second, e.first)) {
>> // The work item was completely processed.
>> - assert(BlockValueStack.top() == e && "Nothing should have been
>> pushed!");
>> + assert(BlockValueStack.back() == e && "Nothing should have
>> been pushed!");
>> assert(TheCache.hasCachedValueInfo(e.second, e.first) &&
>> "Result should be in cache!");
>> DEBUG(dbgs() << "POP " << *e.second << " in " <<
>> e.first->getName()
>> << " = " <<
>> TheCache.getCachedValueInfo(e.second, e.first) << "\n");
>> - BlockValueStack.pop();
>> + BlockValueStack.pop_back();
>> BlockValueSet.erase(e);
>> } else {
>> // More work needs to be done before revisiting.
>> - assert(BlockValueStack.top() != e && "Stack should have been
>> pushed!");
>> + assert(BlockValueStack.back() != e && "Stack should have been
>> pushed!");
>> }
>> }
>> }
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list