[PATCH] D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs

Ulrich Weigand via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 25 05:30:37 PDT 2017


uweigand added a comment.

In https://reviews.llvm.org/D37899#879892, @jonpa wrote:

> Patch updated.
>
> > I looked into making i128 legal when I recently added the 128-bit atomics, but in the end decided against it. Not only would it be a lot of work (since you basically would have to reimplement all the operations on i128 that are currently handled by common code), but it is not clear that we actually always want it.
>
> I thought one could get away with adding calling setOperationAction(..., i128, Expand) on all operations that we do not support...


It turns out this doesn't work.  E.g. in the case of a 128-bit add, this is split into a sequence of 64-bit add-with-carry operations by the *type legalization* code, not the normal expander.  If you declare i128 a legal type, the type legalizer doesn't do this any more.  But then if you mark the ADD as Expand, the normal expander doesn't know what to do with it either, and -I think- just falls back to a libcall.

>> i128 lives in a even/odd register pair, but this is really only needed for the few special instructions that operate on those pairs. If you had just a normal load followed by a normal store of a i128 value, you *want* to break it into two independent i64 copies -- there is no reason at all to constrain the register allocator to use a even/odd pair here.
> 
> I would have expected that for a tiny live range, it is always better to avoid extra COPYs.

My point was for a 128-bit load feeding into a 128-bit store, both the load and the store gets expanded to two 64-bit loads / stores anyway.   So we end up with two completely independent 64-bit data flow paths, which require 64-bit registers --- but there is absolutely no benefit to require that these registers form an even/odd pair, they might as well be arbitrary registers (it might even be the same one if the sequence ends up scheduled as LG / STG / LG / STG).  Actually the same applies to most of the other cases where the 128-bit operation ends up split into 64-bit operations by the type legalizer.

> The updated patch uses LiveIntervals to quickly check if the two connected vregs are local to MBB and what the first and last instructions are (it may be possible to do this without LIS if we are willing to scan the whole MBB). I merely tried some values of 10 instructions and 8 phys-reg operands in order to detect a non-safe live-range. It may be that we want to check for vreg operands as well.
> 
> I don't have adequate tests to tune this further, but this might hopefully be a good starting point. At least all the regressions we saw before are now gone, while still handling your test case.

This looks promising.   Not sure if anybody would object to changing the interface to pass in the LiveIntervals, but it seems reasonable to me.

Not sure about the magic numbers -- does it matter whether we stop after 10 MIs?   What kind of compile-time performance impact would it have to scan the whole range?

It looks like your check for > 8 occurrences of physical registers would trigger even if it is always the same register.  Should we instead keep a bitmap of the used registers, so we know how many *different* ones are used?  Also, it should probably only check for GPRs, not any random physical register.


https://reviews.llvm.org/D37899





More information about the llvm-commits mailing list