[llvm-commits] PATCH: Expose the merge interface for LiveInterval and use it to speed up a few hot paths in LiveIntervalAnalysis

Chandler Carruth chandlerc at gmail.com
Sun Jul 15 15:00:43 PDT 2012


On Sun, Jul 15, 2012 at 11:43 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

>
> On Jul 15, 2012, at 5:00 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
>
> > These two callsites to addRange showed up as fairly hot, taking about
> > 10% of some executions with many basic blocks and variables.
>
> You mean the asan test case, right?


Yes, but only after I've switched it to produce more "normal" looking
functions. They now look essentially the same as metaprogramming-unrolled
loops coming out of uses of Boost and other C++ libraries.


> It is important to pay attention to the performance of typical
> compilations which look very different.
>

Very aware. I'm working on a bit of infrastructure here that will let me
quickly test a patch out against our entire codebase and see if either the
average time to compile or the shape of the distribution of compile times
changes. I'm happy to hold back from too many more patches until that is
done. We're planning on also running it nightly to track changes over time.


> > Exposing the merge interface and using it to batch each loop makes them
> > vanish from the profile, dropping compilation time on one address sanitizer
> > test case by 10%.
>
> +    /// \brief Merge the specified array of LiveRanges into this interval.
> +    ///
> +    /// This merges with existing ranges as appropriate. Clients should use
> +    /// this interface if potentially many ranges are going to be merged in
> +    /// sequence rather than calling addRange repeatedly.
> +    void mergeRanges(ArrayRef<LiveRange> Ranges);
>
> The problem is, this interface exposes the innards of the LiveInterval
> representation and the merge algorithm. It doesn't fit the callers well,
> forcing them to copy their data structures just to fit the API. (And then
> the merge makes another copy).
>

My expectation is that in the small case, the lack of allocation makes
these copies a non-issue, and in the large case, the actual merge is the
dominant cost. The profiles I've seen so far back that up. I also don't
think a batching interface is a huge imposition on the callers; it added
only two lines of code in each case.

> It would be much better to wrap the merge algorithm state in a class with
> an add(start, end, valno) function.
>
> That would also help solve the second problem: The ranges to merge must be
> sorted, but the callers only have almost-sorted lists. SlotIndex numbering
> follows layout which is not the same as MBB numbering. The kill lists are
> probably layout-ordered most of the time, but it is not an invariant being
> maintained by the LV updaters.
>
> A merge algorithm class could easily detect ordering violations and
> restart itself. Restarting the merge is no slower than calling addRange(),
> and it will be very effective for the almost-sorted cases.


I actually thought about this, but I think a batch interface is ultimately
*simpler* than an entire separate class to manage merging. The separate
class would either introduce state between calls to the LiveInterval or
require exposing an RAII object to manage the process. The latter would
probably be the cleanest, but it would impose a further burden on callers
to provide a scope representing the batch. My general feeling is that as
long as a batch can be expressed as an input array, perhaps with ordering
requirements, it's simpler to just expose the batch interface. When the
batching requires still more logic or data structures to represent, it's
better to expose a dedicated system for it.

Ultimately, I'm not sure it's worth it to avoid the coupling here. It
doesn't seem that bad for LiveIntervalAnalysis to be aware of, and take
advantage of, the LiveInterval representation to achieve efficiency.
Consider the simplicity of my patch versus the amount of code that would
have to be added to support a merge class.

That said, clearly the almost-sorted order is a problem. I actually had a
pile of test failures sitting in front of me that I failed to look at
before hitting the 'send' button; my bad there. I suspect a simple and
fast solution is merely to call 'std::sort' on the vector and document the
requirement.

Anyways, let me know if you find my arguments about batch interfaces versus
a merge class unconvincing. ;] I'm not in any hurry with this patch as it's
just a 10% or 20% drop in very edge-case compilations.
