[llvm] r204076 - Use range metadata instead of introducing selects.

Mon Apr 7 17:09:25 PDT 2014

On Mar 25, 2014, at 10:50 AM, Dan Gohman <dan433584 at gmail.com> wrote:

> 
> On Tue, Mar 25, 2014 at 7:24 AM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> On 25 March 2014 09:49, Dan Gohman <dan433584 at gmail.com> wrote:
> > Hi Lang,
> >
> > I can reproduce the performance regression on fourinarow, at least. With the
> > patch, the code size and static instruction count of the benchmark's one
> > embarrassingly-hot function is lower, the dynamic instruction count is lower,
> > and the stack frame is smaller, but it still runs slower. Instruction
> > selection is basically the same, except that there are fewer cmovs. There
> > appears to be a minor difference in instruction scheduling in the hot
> > function. The regression disappeared when I experimented with non-default
> > values for -pre-RA-sched. However, I'm not prepared for the adventure of
> > changing the instruction scheduler's heuristics at this time, so I'll just
> > let this patch go for now.
> 
> Do you have a small .ll testcase?
> 
> Not handy anymore, but it's just
> MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow with -O3 -flto on x86-64.

fourinarow is jittery, sensitive to register pressure, and doesn’t like codegen changes. Were there several other significant regressions and no significant improvements? Were the results overall bad on non -flto builds too? Or did we just have bad luck with LTO? Are there regressions on any real benchmarks?

Is there any reason to believe this patch is chronically increasing register pressure?

The default SD scheduler should simply be preserving IR order. If the patch fundamentally makes sense, and the generated code before register coalescing looks better by simple metrics (dynamic instruction count and critical path), then the only way forward is to file a bug against the register coalescer and MI scheduler (which are often two sides of the same problem).

I don’t think it’s a good idea to retune these passes to enable unrelated checkins just to make the test-suite numbers look better. That only makes it more difficult to solve the codegen problems in robust ways.

-Andy