[PATCH] D21429: Make the insertion of predicate deps in the schedule graph not quadratic in the number of predicate deps.
Chandler Carruth via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 16 03:37:31 PDT 2016
chandlerc created this revision.
chandlerc added a subscriber: llvm-commits.
Herald added subscribers: mcrosier, MatzeB.
Before this we would do a linear scan of all existing predicate deps to
check for overlap. When inserting N non-duplicate predicate deps,
this is trivially O(N^2).
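As a rough illustration of where the N^2 comes from, here is a minimal
sketch that counts the comparisons made by a scan-before-insert loop.
The names `Edge`, `addWithLinearScan` and `totalComparisons` are
hypothetical stand-ins for illustration only, not the scheduler's
actual types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified dependency edge (not the real LLVM type).
struct Edge {
  int Node; // id of the predecessor node
  int Kind; // kind of dependency
};

// Appends E unless an equal edge already exists. Returns how many
// existing edges were compared against before deciding.
std::size_t addWithLinearScan(std::vector<Edge> &Preds, const Edge &E) {
  std::size_t Comparisons = 0;
  for (const Edge &P : Preds) {
    ++Comparisons;
    if (P.Node == E.Node && P.Kind == E.Kind)
      return Comparisons; // duplicate found, nothing appended
  }
  Preds.push_back(E);
  return Comparisons;
}

// Inserts N distinct edges and returns the total comparison count,
// which is 0 + 1 + ... + (N-1) = N*(N-1)/2.
std::size_t totalComparisons(int N) {
  std::vector<Edge> Preds;
  std::size_t Total = 0;
  for (int I = 0; I < N; ++I)
    Total += addWithLinearScan(Preds, Edge{I, 0});
  assert(Preds.size() == static_cast<std::size_t>(N));
  return Total;
}
```

With no duplicates, every insertion scans the whole list, so the total
grows as N*(N-1)/2.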
Unfortunately, there are two different definitions of "overlap" used so
a single map is insufficient. Instead, we use an index map and a count
map. The count map handles the preds that aren't required if there
is an existing edge, and the index map allows us to find an exact
existing dep and update it in constant time w.r.t. the number of
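
The two-map idea from the paragraph above can be sketched roughly as
follows. This is a simplified illustration under stated assumptions,
not the code in the patch: `Dep`, `PredList`, the (node, kind, reg) key
and the use of std::unordered_map are all hypothetical choices made so
the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <tuple>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified dependency edge (not the real LLVM type).
struct Dep {
  int Node;         // id of the predecessor node
  int Kind;         // kind of dependency
  int Reg;          // register or ordering tag
  unsigned Latency; // latency carried by the edge
};

// Hypothetical container for one node's predecessor edges.
class PredList {
  using Key = std::tuple<int, int, int>; // (Node, Kind, Reg)
  struct KeyHash {
    std::size_t operator()(const Key &K) const {
      std::size_t H = std::hash<int>()(std::get<0>(K));
      H = H * 31 + std::hash<int>()(std::get<1>(K));
      H = H * 31 + std::hash<int>()(std::get<2>(K));
      return H;
    }
  };

  std::vector<Dep> Preds;
  // Index map: exact edge -> its position in Preds.
  std::unordered_map<Key, std::size_t, KeyHash> Index;
  // Count map: predecessor node -> number of edges to it.
  std::unordered_map<int, unsigned> Count;

public:
  // Returns true if a new edge was appended.
  bool addPred(const Dep &D, bool Required = true) {
    // First notion of overlap: an edge that isn't required is
    // redundant if any edge to the same node already exists.
    if (!Required && Count.count(D.Node))
      return false;
    // Second notion of overlap: the same edge ignoring latency.
    // Keep the existing edge and raise its latency if needed.
    Key K{D.Node, D.Kind, D.Reg};
    auto It = Index.find(K);
    if (It != Index.end()) {
      assert(It->second < Preds.size());
      Dep &Existing = Preds[It->second];
      if (Existing.Latency < D.Latency)
        Existing.Latency = D.Latency;
      return false;
    }
    Index.emplace(K, Preds.size());
    ++Count[D.Node];
    Preds.push_back(D);
    return true;
  }

  const std::vector<Dep> &preds() const { return Preds; }
};
```

Both lookups are expected constant time with hashed maps, so inserting
N distinct edges no longer costs a scan per insertion. Removing an edge
would also have to keep both maps in sync, which this sketch leaves
out.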
This doesn't fix the linear scan over *successor* deps, but for whatever
reason, even with the *insane* schedule graph formed by
test/CodeGen/AMDGPU/spill-scavenge-offset.ll, I can't see that really
hot on my profile. If I can find a way to make that show up, I'll look
at fixing that linear scan as well.
However, one possible reason I can't see this is because fixing this
quadratic behavior immediately uncovers a second quadratic behavior. I'm
going to try to fix that next.
This fix alone is good for a 20% to 55% speed up in the above test case
prior to r272860 which somewhat avoided triggering this quadratic
behavior. A debug build for me drops from 40s to 32s for the entire
test, and an optimized build from 6s to 4.5s. This shaves about 4s off
of my 'ninja check-llvm' time in debug builds where this test is one of
the tall poles. I had really hoped for more dramatic improvements but
there appears to be too much overhead and too many other quadratic
things going on...
Given that this requires two map data structures, one of which has
a decidedly non-trivial key, I'm not 100% certain this is the best
approach. It would be so much nicer to have tiered structures from SUnit
to the set of deps on that edge... But that looks like a much more
invasive change. Thoughts? Personally, I still lean toward not having
a quadratic algorithm. =]
Note that these quadratic algorithms impact both the SDAG scheduling and
MI scheduling because both build the Schedule DAG. =[ =[ =[ =[