[LLVMdev] Register pressure mechanism in PRE or Smarter rematerialization/split/spiller/coalescing ?

Wed Jul 15 07:48:06 PDT 2015

On Tue, Jul 14, 2015 at 11:43 PM, Lawrence <lawrence at codeaurora.org> wrote:
> I thought about it a little bit more, and I think adding register pressure control in your patch or in PRE may be the only choice.
>
> Because, at least for the case I am looking at, what your patch did was create relatively complex, longer live ranges. Rematerialization is not smart enough to undo your change, at least not without a lot of work; coalescing only creates even longer live ranges, not shorter ones; the spiller can't help, since it is the spiller that created the spills/reloads due to high register pressure; splitting can shorten the live ranges, but I don't think it can handle your case without a lot of work.
>

1. As I mentioned, my patch simply fixes a bug in the implementation of
one of the two PREs LLVM has.  It does not change the PRE algorithm or
add anything to it.  The code had a bug. I fixed the bug :P.  PRE is
*not even adding more code in this case*.  The code is already there.
All it is doing is inserting a phi node.  If you transformed your
code to use memory, and reverted my patch, you'd get the same result,
because Load PRE will do the same thing. It's what PRE does.
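
To make the "just inserting a phi node" point concrete, here is a rough
hand-written C analogy. It is not the test case from this thread and not
LLVM IR; the function names before_pre and after_pre are made up for
illustration. Both branches already compute a + b, so removing the
recomputation after the join adds no new arithmetic. It only merges the
two existing values, which is the job a phi node does in SSA form.

```c
/* Before: a + b is computed in both branches and again after the join. */
static int before_pre(int a, int b, int cond, int *out) {
    if (cond)
        *out = (a + b) + 1;
    else
        *out = (a + b) - 1;
    return (a + b) * 2;      /* redundant: available on every incoming path */
}

/* After (hand-written): no new computation is added.  The variable t
 * stands in for the phi node that merges the two existing values. */
static int after_pre(int a, int b, int cond, int *out) {
    int t;                   /* t = phi(t_then, t_else) in SSA terms */
    if (cond) {
        t = a + b;
        *out = t + 1;
    } else {
        t = a + b;
        *out = t - 1;
    }
    return t * 2;            /* t is now live across the join */
}
```

The cost is visible in the second version: t stays live from each
branch to the final use, which is the longer live range that RA then
has to deal with.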

2. GCC and other compilers have PREs that do literally the same thing
my patch does (you are welcome to verify, I wrote GCC's :P), and
apparently are smart enough to handle this in RA.  So I'm going to
suggest that it is, in fact, possible to do so, and I'm going to
further suggest that if we want to match their performance, we need to
be able to do the same.  You can't simply "turn down" any optimization
that RA may have to deal with.  It usually doesn't work in practice.
This is one of the reasons good RA is so hard.

3. As I also mentioned, register pressure heuristics in PRE simply do
not work.  They've been tried.  By many.  With little to no success.

PRE is too high in the stack of optimizations to estimate register
pressure in any sane fashion.  It's pretty much a fool's errand.  You
can never tune it to do what you want.  *Many* have tried.

Your base-level problem here is that all modern PRE algorithms (except
for min-cut PRE, as I mentioned) are based on a notion of lifetime
optimality. That is, they extend lifetimes as minimally as possible to
still eliminate a given redundancy. Ours does the same.
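
As a rough illustration of what "lifetime optimal" means, here is a
hand-written C sketch. It is not LLVM's implementation, and the names
place_early and place_late are made up for this example. Assume the
original code computed a * b in the `then` branch and again after the
join, so PRE must insert a computation on the `else` path. Both
functions below remove the redundancy; they differ only in where the
inserted computation goes.

```c
/* Early placement: the inserted a * b sits at the top of the else
 * branch, so t is live across the whole loop. */
static int place_early(int a, int b, int cond, const int *v, int n) {
    int t, x = 0;
    if (cond) {
        t = a * b;
        x = t;
    } else {
        t = a * b;                 /* inserted as early as possible */
        for (int i = 0; i < n; i++)
            x += v[i];
    }
    return x + t;
}

/* Late placement: the inserted a * b is delayed to the end of the else
 * branch, so t is live only from there to the use after the join. */
static int place_late(int a, int b, int cond, const int *v, int n) {
    int t, x = 0;
    if (cond) {
        t = a * b;
        x = t;
    } else {
        for (int i = 0; i < n; i++)
            x += v[i];
        t = a * b;                 /* inserted as late as possible */
    }
    return x + t;
}
```

A lifetime-optimal algorithm picks the second form. Note that even
there t still has to stay live across the join, which is exactly the
extension that removing the redundancy requires.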

However, this is not an entirely useful metric.  Optimizing for some
other metric is what something like min-cut PRE lets you do.
But even then, register pressure heuristics are almost certainly not
the answer.

4. This was actually already discussed when the patch was submitted,
and the consensus was "we should just fix RA".  Feel free to look at
the discussion 5 months ago.

I would suggest, if you want to fix this, you take the approach that
was discussed then.