[PATCH] D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints

Jonas Paulsson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 22 01:19:41 PDT 2017


jonpa updated this revision to Diff 112120.
jonpa added a comment.
Herald added a subscriber: javed.absar.

In https://reviews.llvm.org/D36795#847448, @uweigand wrote:

> In https://reviews.llvm.org/D36795#845610, @jonpa wrote:
>
> > I think that to handle those cases we would have to constrain regclasses somehow after coalescing.
>
>
> This could be done in TRI->updateRegAllocHint, possibly?


I'm not sure whether this is OK (although it should be), but I tried it and it seemed to improve things a bit.

>> And maybe better than giving hard hints would be to immediately after one out of two GRX32 regs gets allocated constrain the other virtreg.
> 
> Not sure if there is currently any place where this can be done.  Does the register allocator (all of them) even go through registers one-by-one and assigns them, or is the algorithm more complex?

IIRC, this is more complex with RegAllocGreedy: it does not just allocate a physical register to an interval once, but may also later evict (cancel) that assignment and give the physreg to another interval instead.

>> I am not convinced still that making this guarantee generally is possible (without a target pre-ra pass to do this), especially not for all the different kinds of register allocators that are around / may appear. It seems that some kind of broader construct is needed in order to always be sure this never goes wrong. Maybe a property of a register class somehow that all operands of any MI must belong to one out of two sub regclasses...? :-/
> 
> Note that this, while appropriate for LOCRMux, would be too restrictive for certain other operations.  For example, for comparisons, we can allow high-high, low-low, and also high-low compares, but not low-high compares, and similarly for add and subtract.  (For comparison, the alternatives / constraints mechanism in GCC allows targets to exactly describe the valid combinations for each instruction, and the register allocator will choose any of those as appropriate.)

It would be nice to be able to do something like that. I suppose that if we could constrain regclasses during register allocation, we could afford to do recursive searches, since we would only do them once (in contrast to giving hints). If we are lucky this would turn out well, although we probably can't "start over" after constraining GRX32 at the point where an assignment is evicted. Well, perhaps that could be added as well. Anyway, this does not seem to be needed at the moment, but it may be later if we start to see a lot more GRH32 registers.
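
To make the idea of per-instruction valid combinations concrete, here is a minimal standalone C++ sketch. It is not LLVM code and not part of the patch: `Half`, `OpKind`, `validCombos`, `isValidCombo` and `allowedSecondHalves` are hypothetical names, and reading "high-low" as (first operand, second operand) is an assumption. It lists the combinations mentioned above and derives which halves remain legal for the second operand once the first one has been assigned.

```cpp
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical model: which 32-bit half of a 64-bit GPR an operand lives in.
enum class Half { Low, High };

// Hypothetical operation kinds, limited to the cases discussed in this thread.
enum class OpKind { LOCRMux, Compare };

// Valid (first operand, second operand) combinations per operation.
// LOCRMux: both operands must be in the same half.
// Compare: high-high, low-low and high-low are allowed, low-high is not.
inline std::vector<std::pair<Half, Half>> validCombos(OpKind Kind) {
  switch (Kind) {
  case OpKind::LOCRMux:
    return {{Half::Low, Half::Low}, {Half::High, Half::High}};
  case OpKind::Compare:
    return {{Half::Low, Half::Low},
            {Half::High, Half::High},
            {Half::High, Half::Low}};
  }
  return {};
}

// True if the given pair of halves is listed as valid for this operation.
inline bool isValidCombo(OpKind Kind, Half First, Half Second) {
  for (const auto &Combo : validCombos(Kind))
    if (Combo.first == First && Combo.second == Second)
      return true;
  return false;
}

// Once the first operand has been assigned, return the halves that are
// still allowed for the second operand.
inline std::vector<Half> allowedSecondHalves(OpKind Kind, Half First) {
  std::vector<Half> Result;
  for (Half H : {Half::Low, Half::High})
    if (isValidCombo(Kind, First, H))
      Result.push_back(H);
  return Result;
}
```

In this toy model, assigning the first compare operand to a low half leaves only the low half for the second, while a high first operand leaves both halves open. That is the asymmetry a single "all operands in one sub regclass" property could not express.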

I updated the patch and made some more builds of SPEC in order to see the effects of the various ways to tackle this:

  Branch \ Statistic:                     LOCRs_lo        LOCRs_hi     RISBs   "Number of spilled live ranges"
  Master                                  6382            4            225     48939
  LOCRHINTS                               6523            28           60      48947
  LOCRCONSTRAIN                           6429            3            179     48957
  LOCRCONSTRAIN + UPDATEHINT              6464            3            144     48965
  LOCRHINTS + LOCRCONSTRAIN               6558            28           25      48958
  MOREMUX + LOCRCONSTRAIN                 6561            28           22      48959
  HARDHINTS                               6580            27           4       48965
  HARDHINTS + LOCRCONSTRAIN               6581            27           3       48971
  HARDHINTS + LOCRCONSTRAIN + UPDATEHINT  6581            27           3       48974
  HARDHINTS + UPDATEHINT                  6581            27           3       48970
  HARDHINTS + MOREMUX                     6584            27           0       48966   :-)
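
As a quick sanity check on the table, the small standalone C++ sketch below (the `Row` struct and helper names are hypothetical, the numbers are copied from the table) adds up LOCRs_lo, LOCRs_hi and RISBs for each branch. The sum is the same in every row, which would be consistent with each LOCRMux ending up as exactly one of the three. That reading is an assumption drawn from the arithmetic, not something checked separately.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One table row: branch name and the three instruction counts.
struct Row {
  std::string Branch;
  int LocrLo;
  int LocrHi;
  int Risb;
};

// Numbers copied verbatim from the table above.
inline std::vector<Row> tableRows() {
  return {
      {"Master", 6382, 4, 225},
      {"LOCRHINTS", 6523, 28, 60},
      {"LOCRCONSTRAIN", 6429, 3, 179},
      {"LOCRCONSTRAIN + UPDATEHINT", 6464, 3, 144},
      {"LOCRHINTS + LOCRCONSTRAIN", 6558, 28, 25},
      {"MOREMUX + LOCRCONSTRAIN", 6561, 28, 22},
      {"HARDHINTS", 6580, 27, 4},
      {"HARDHINTS + LOCRCONSTRAIN", 6581, 27, 3},
      {"HARDHINTS + LOCRCONSTRAIN + UPDATEHINT", 6581, 27, 3},
      {"HARDHINTS + UPDATEHINT", 6581, 27, 3},
      {"HARDHINTS + MOREMUX", 6584, 27, 0},
  };
}

// Sum of the three instruction counts for one row.
inline int total(const Row &R) { return R.LocrLo + R.LocrHi + R.Risb; }

// True if every row has the same total as the first row.
inline bool allTotalsEqual(const std::vector<Row> &Rows) {
  for (std::size_t I = 1; I < Rows.size(); ++I)
    if (total(Rows[I]) != total(Rows[0]))
      return false;
  return true;
}
```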

Setting the RegClass in updateRegAllocHint(), per your suggestion, did seem to help a bit, but for some reason it didn't solve those tricky cases. I then tried doing a bit more searching for LOCRMuxes (without recursing), and found that this actually handled the rest (see the last line of the table).
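
To illustrate what a one-level (non-recursive) search over LOCRMuxes could look like, here is a toy standalone C++ model. It is only a sketch under assumptions and not the code in the patch: `LocrMux`, `Half` and `hintForVReg` are hypothetical names, virtual registers are plain integers, and "already assigned" is modelled as a simple map.

```cpp
#include <map>
#include <vector>

// Hypothetical model: which 32-bit half a register was assigned to.
enum class Half { Low, High };

// Hypothetical toy instruction: a LOCRMux tying two virtual registers.
struct LocrMux {
  int Dst;
  int Src;
};

// Look at every LOCRMux that uses VReg and, if the other operand already
// has an assigned half, report that half as the hint. The search only looks
// at direct partners and does not follow chains of LOCRMuxes.
// Returns true and sets Hint on success, false if no partner is assigned.
inline bool hintForVReg(int VReg, const std::vector<LocrMux> &Muxes,
                        const std::map<int, Half> &Assigned, Half &Hint) {
  for (const LocrMux &M : Muxes) {
    int Other;
    if (M.Dst == VReg)
      Other = M.Src;
    else if (M.Src == VReg)
      Other = M.Dst;
    else
      continue; // This LOCRMux does not involve VReg.

    auto It = Assigned.find(Other);
    if (It != Assigned.end()) {
      Hint = It->second;
      return true;
    }
  }
  return false;
}
```

With LOCRMuxes (1, 2) and (3, 1) and only register 3 assigned to a high half, register 1 gets a high hint, while register 2 gets none, since reaching register 3 from it would require recursing through register 1.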

I am still not sure of all the implications of "hard hints" or of constraining the regclass in updateRegAllocHint(); I just know that "it doesn't crash" and gives "good results" ;-)  Any thoughts, anyone? Quentin?

HARDHINTS is currently looking very promising in preliminary benchmark results. (I have not yet tried MOREMUX, but it should also be good since it is very similar.)


https://reviews.llvm.org/D36795

Files:
  include/llvm/Target/TargetRegisterInfo.h
  lib/CodeGen/AllocationOrder.cpp
  lib/CodeGen/AllocationOrder.h
  lib/CodeGen/TargetRegisterInfo.cpp
  lib/Target/ARM/ARMBaseRegisterInfo.cpp
  lib/Target/ARM/ARMBaseRegisterInfo.h
  lib/Target/SystemZ/SystemZISelLowering.cpp
  lib/Target/SystemZ/SystemZInstrInfo.cpp
  lib/Target/SystemZ/SystemZRegisterInfo.cpp
  lib/Target/SystemZ/SystemZRegisterInfo.h

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D36795.112120.patch
Type: text/x-patch
Size: 16562 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170822/fe66b698/attachment.bin>

