[llvm-commits] [PATCH] Allow SelectionDAGBuilder to reorder loads past stores

Wed Dec 21 08:44:22 PST 2011

Hal, 

  I actually made the same fix internally a couple of months ago, and it
also resulted in severe performance degradation. To solve it for our back
end (Hexagon) I ended up modifying the scheduler. In fact, I introduced
our own scheduler (we call it VLIW) to handle the newly available
parallelism and the resulting register pressure. The result was a
significant overall performance gain on a wide (internal) test suite, with
some kernels gaining 40-60%. I tried to accomplish the same with the
existing infrastructure, but failed. Now you are seeing a similar issue on
another architecture. I really wonder what your next move shall be.
  I have not checked my changes in for the simple reason that we are not
caught up with the LLVM tip in our internal repository (we are several
months behind), and Evan has changed the rules of the game enough (I mean
the removal of the top-down schedulers) for my design to be incompatible
with the tip (my scheduler is top-down).
  I still plan to submit my work, but it needs to be changed first, and
that takes time.
  Finally, what I am trying to say is this: if you are interested in what
I have been doing, or you know a better solution for the problem within the
existing infrastructure, I would be very interested in talking about it.

Thanks.

Sergei Larin

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.

> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel
> Sent: Wednesday, December 21, 2011 7:50 AM
> To: Jakob Stoklund Olesen
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH] Allow SelectionDAGBuilder to
> reorder loads past stores
> 
> It turns out that a significant cause of the performance regressions
> caused by this patch is related to this issue: with the patch applied
> the scheduler is now free to schedule many more things, especially
> stores, after calls (especially intrinsics that are expanded to lib
> calls). This tendency is bad because of the spilling necessary to cross
> the call boundary. I am working on a proposed solution, and I'll post
> an updated patch soon.
> 
> Thanks again,
> Hal
> 
> On Tue, 2011-12-20 at 12:52 -0600, Hal Finkel wrote:
> > On Tue, 2011-12-20 at 10:44 -0800, Jakob Stoklund Olesen wrote:
> > > On Dec 20, 2011, at 9:22 AM, Hal Finkel wrote:
> > >
> > > > > when I later look at the register map, only XMM0 and XMM1 are ever
> > > > > assigned to vregs, everything else is spilled. This is wrong. Do you
> > > > > have any ideas on what could be going wrong or other things I should
> > > > > examine? Could the register allocator not be accounting correctly for
> > > > > callee-saved registers when computing live-interval interference
> > > > > information?
> > >
> > > There are no callee-saved xmm registers.
> >
> > Thanks! I was mixing up the Win64 calling convention with the regular
> > one. That explains things. So I suppose the right thing to do is to
> > make sure all stores are flushed before any call (which I think it
> > already does) and before any intrinsic that will be expanded (which it
> > does not currently do).
> >
> >  -Hal
> >
> > >
> > > /jakob
> > >
> >
> 
> --
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory