[llvm-commits] [PATCH] Allow SelectionDAGBuilder to reorder loads past stores

Hal Finkel hfinkel at anl.gov
Wed Dec 21 12:36:14 PST 2011


For the purpose of discussion, I have attached an updated version of my
patch. This version partially resolves the scheduling-after-call
(excess-spilling) problem by introducing an overriding heuristic into
BURRSort which prioritizes all non-loads over calls. Roughly, this
causes work that can be done before a call (except for loads) to be
done prior to the call. In practice, this still does not always happen
because of the local nature of the scheduling decisions. Surprisingly (to
me at least), this relatively simple change seems to have eliminated all
of the major performance regressions while leaving a substantial aggregate
speedup [for some code it seems better to prioritize even the loads, and
I don't understand that at all]. Nevertheless, this change has its
downsides, and I would like to discuss various alternatives for
mitigating the scheduling problems caused by call-induced spilling.
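
To make the intent of the heuristic concrete, here is a toy sketch of the
selection rule. It is not the code in the attached patch and it does not use
LLVM's data structures: Node, Kind, basePriority, pickNext and scheduleOrder
are all hypothetical names, the model is top-down with no dependency edges,
and basePriority merely stands in for whatever the existing heuristics would
compute. The only thing it illustrates is the override: a call is never
picked while some non-load, non-call node is still ready, whereas loads
compete with calls on ordinary priority.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node kinds for illustration only (not LLVM's SUnit).
enum class Kind { Load, Call, Other };

struct Node {
  std::string name;
  Kind kind;
  int basePriority; // stand-in for the existing heuristics; higher issues first
};

// Pick the index of the next node to issue from a non-empty ready list.
// Override: while any non-load, non-call node is ready, calls are not
// candidates. Among the remaining candidates the highest basePriority wins
// (ties go to the earliest entry).
std::size_t pickNext(const std::vector<Node> &ready) {
  assert(!ready.empty() && "ready list must not be empty");

  bool otherReady = false;
  for (const Node &n : ready)
    if (n.kind == Kind::Other)
      otherReady = true;

  std::size_t best = ready.size(); // sentinel: nothing chosen yet
  for (std::size_t i = 0; i < ready.size(); ++i) {
    if (otherReady && ready[i].kind == Kind::Call)
      continue; // defer the call until the non-load work is done
    if (best == ready.size() ||
        ready[i].basePriority > ready[best].basePriority)
      best = i;
  }
  return best;
}

// Issue every node in the order chosen by pickNext (no dependencies modeled).
std::vector<std::string> scheduleOrder(std::vector<Node> ready) {
  std::vector<std::string> order;
  while (!ready.empty()) {
    std::size_t i = pickNext(ready);
    order.push_back(ready[i].name);
    ready.erase(ready.begin() + static_cast<std::ptrdiff_t>(i));
  }
  return order;
}
```

With a ready list containing a high-priority call, a store, an add and a
low-priority load, this sketch issues the add and the store before the call,
while the load is left to ordinary priority and ends up after it. The real
scheduler makes this decision locally on a DAG with dependencies, which is
one reason the desired ordering does not always happen in practice.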

I consider this patch to be for discussion only because I have not included
any regression-test updates, and there is still a test-suite failure on
x86_64 (and likely on other platforms as well) that needs to be resolved.

Please let me know what you think.

Thanks again,
Hal

On Wed, 2011-12-21 at 07:49 -0600, Hal Finkel wrote:
> It turns out that a significant cause of the performance regressions
> caused by this patch is related to this issue: with the patch applied
> the scheduler is now free to schedule many more things, especially
> stores, after calls (especially intrinsics that are expanded to lib
> calls). This tendency is bad because of the spilling necessary to cross
> the call boundary. I am working on a proposed solution, and I'll post an
> updated patch soon.
> 
> Thanks again,
> Hal
> 
> On Tue, 2011-12-20 at 12:52 -0600, Hal Finkel wrote:
> > On Tue, 2011-12-20 at 10:44 -0800, Jakob Stoklund Olesen wrote:
> > > On Dec 20, 2011, at 9:22 AM, Hal Finkel wrote:
> > > 
> > > > when I later look at the register map, only XMM0 and XMM1 are ever
> > > > assigned to vregs, everything else is spilled. This is wrong. Do you
> > > > have any ideas on what could be going wrong or other things I should
> > > > examine? Could the register allocator not be accounting correctly for
> > > > callee-saved registers when computing live-interval interference
> > > > information?
> > > 
> > > There are no callee-saved xmm registers.
> > 
> > Thanks! I was mixing up the Win64 calling convention with the regular
> > one. That explains things, so, I suppose the right thing to do is to
> > make sure all stores are flushed before any call (which I think it
> > already does), and any intrinsic that will be expanded (which it will
> > not currently do).
> > 
> >  -Hal
> > 
> > > 
> > > /jakob
> > > 
> > 
> 

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm_lsro-20111221.diff
Type: text/x-patch
Size: 23163 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111221/e5c5a307/attachment.bin>

