[PATCH] D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs

Ulrich Weigand via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 25 08:16:38 PDT 2017


uweigand added a comment.

In https://reviews.llvm.org/D37899#880029, @jonpa wrote:

> > Not sure about the magic numbers -- does it matter whether we stop after 10 MIs?   What kind of compile-time performance impact would it have to scan the whole range?
>
> I don't know, I just thought that we basically wanted to at least handle the type of case where there is a GR128 op and then some subreg COPY(s) very close. My reason for having the instruction limit is that we can't know the register pressure, but it seems safe to think that tiny live ranges checked in this way could be coalesced.


Ah, I misread the code -- if we have more than 10 instructions, you default to *not* coalese (I somehow thought it was the opposite).  That seems more reasonable.  The interesting question will then be how this interacts with the scheduler, i.e. how frequently it happens that unrelated code is scheduled in between, and thus breaking the coalescing.  Not sure if that is a real concern ...

>> It looks like your check for > 8 occurrences of physical registers would trigger even if it is always the same register.  Should we instead keep a bitmap of the used registers, so we know how many *different* ones are used?  Also, it should probably only check for GPRs, not any random physical register.
> 
> Checking for GPRs is something I seem to have missed, but however the different reg check is something I actually took away after first using a SmallSet. I thought that if we would to this properly, we should in addition also check aliases after finding the set of different regs. I then changed my mind since we are merely trying to do an estimate, and we don't really know for sure what the right limit is, etc and may rather just keep it simple (BTW, I also see that it should do +=2 for a GR128 reg).

Yes, that's why I was thinking of a simple bitmap of the 16 actual GPRs, and just set a bit if this GPR is touched (no matter via which super/subreg or alias).

> Taking this a step further, if regalloc can only run out of registers when enough physreg defs overlap the GR128 live range so that there is no single 128-bit register pair to be allocated, then we should basically allow coalescing if there is at least one non-clobbered i128 register pair in the region, right?
> 
> Even if there were many such GR128 virtregs overlapping, one would hope that regalloc would be able to split them so that one such free pair would be enough. Would we want to stop coalescing at that some point in order to reduce spilling? We could perhaps try to by looking for GR128 virtual register operands in the region.

Strictly speaking, yes.  But I think it also doesn't matter to avoid coalescing even if the register pressure isn't *that* extreme -- it is true that as long as at least one GPR pair is available, the allocator should be able to fix up anything via spilling, but in general spilling is more expensive than extra copying due to non-coalescing would have been ...

(B.t.w. if you do go for the one-pair check, it needs to check for one pair except 0/1, because 0/1 isn't usable for ADDR128.)


https://reviews.llvm.org/D37899





More information about the llvm-commits mailing list