[llvm-commits] [llvm] r82734 - in /llvm/trunk/lib: CodeGen/PrologEpilogInserter.cpp CodeGen/PrologEpilogInserter.h Target/ARM/ARMBaseRegisterInfo.cpp Target/ARM/Thumb1RegisterInfo.cpp

Jim Grosbach grosbach at apple.com
Mon Sep 28 09:25:17 PDT 2009


Hi Jakob,

Thanks for the great feedback. I'll incorporate some changes today now  
that I'm back in the office (was sick over the weekend). Comments below.

On Sep 25, 2009, at 12:47 AM, Jakob Stoklund Olesen wrote:

>
> On 25/09/2009, at 01.52, Jim Grosbach wrote:
>> Start of revamping the register scavenging in PEI. ARM Thumb1 is  
>> the driving
>> interest for this, as it currently reserves a register rather than  
>> using
>> the scavenger for matierializing constants as needed.
>
> Great! Please see comments below.
>
>> +void PEI::scavengeFrameVirtualRegs(MachineFunction &Fn) {
> ...
>> +    // Keep a map of which scratch reg we use for each virtual reg.
>> +    // FIXME: Is a map like this the best solution? Seems like  
>> overkill,
>> +    // but to get rid of it would need some fairly strong  
>> assumptions
>> +    // that may not be valid as this gets smarter about reuse and  
>> such.
>> +    IndexedMap<unsigned, VirtReg2IndexFunctor> ScratchRegForVirtReg;
>
> It seems that for now, we must require that live ranges of virtual  
> registers are disjoint. Otherwise we would need more than one  
> emergency spill slot in the worst case.
>
> With only one virtual register live at a time, the map is indeed  
> overkill.
>
> It would also be a good idea to enforce the one-virtual-register-at- 
> a-time rule with asserts here.
>

Agreed to both. As indicated by the comment, I was thinking towards  
doing something fancier when we start to re-use these. Allowing more  
than one to be live so long as the emergency spill slot is unused, or  
something like that. Thus, I didn't want to encode too many strong  
assumption in place that would make that harder. Upon reflection,  
however, I don't think that's a direction we should go. If we want the  
compiler to be that much smarter about reusing the constants, we  
should work towards letting the real register allocator do the work.  
This pass should be a quick and simple one to handle obvious cases and  
shouldn't try to get too smart.

>> +    ScratchRegForVirtReg.grow(Fn.getRegInfo().getLastVirtReg());
>> +
>> +    for (MachineBasicBlock::iterator I = BB->begin(); I != BB->end 
>> (); ++I) {
>> +      MachineInstr *MI = I;
>> +      for (unsigned i = 0, e = MI->getNumOperands(); i != e; ++i)
>> +        if (MI->getOperand(i).isReg()) {
>> +          unsigned Reg = MI->getOperand(i).getReg();
>> +          if (Reg && TRI->isVirtualRegister(Reg)) {
>
> Would it perhaps be possible to reduce nesting with 'continue' here?

Can't see why not.

>
>> +            // If we already have a scratch for this virtual  
>> register, use it
>> +            unsigned NewReg = ScratchRegForVirtReg[Reg];
>> +            if (!NewReg) {
>> +              const TargetRegisterClass *RC = Fn.getRegInfo 
>> ().getRegClass(Reg);
>> +              NewReg = RS->FindUnusedReg(RC);
>> +              if (NewReg == 0)
>> +                // No register is "free". Scavenge a register.
>> +                // FIXME: Track SPAdj. Zero won't always be right
>> +                NewReg = RS->scavengeRegister(RC, I, 0);
>
> Ideally, I would like to get rid of FindUnusedReg(), so you could  
> just say scavengeRegister() here. The ARMLoadStoreOptimizer is the  
> only one to use FindUnusedReg() without immediately calling  
> scavengerRegister() when it fails. I am no sure if that is  
> intentional.

I've been thinking about that, too. I can see the benefit in asking it  
for a register, but only if a spill isn't necessary. For example, when  
loading a constant, materializing the constant directly may be more  
expensive than a load, but still cheaper if spills are necessary.  
Nothing is being that smart currently, though. In practice, it looks  
like one could call scavengeRegister() directly without first calling  
FindUnusedReg() and get the single-call interface you're looking for.

>
> There is another issue here. FindUnusedReg() will give you a  
> register that is unused /at the current position/. You really need a  
> register that is unused in the whole live range of the virtual  
> register you are replacing. scavengeRegister() won't return a  
> register that is used by the instruction at I, but it looks like we  
> need a variant that can exclude from whole range of instructions.
>
> The rule should be: "NewReg must not be defined by any instruction  
> in the range, except for the last. NewReg must not be an  
> EarlyClobber operand on the last instruction."

In the general case, yes, that's true. In the cases we're dealing  
with, however, we don't need to be that careful. The register is  
defined, then used in the next instruction or two. We're splitting out  
a single instruction referencing the frame when the constant offset is  
too large. Either we're adjusting the stack point by a constant value  
and need a register to hold the constant, or we're accessing a stack  
slot and need a register to hold the offset to the slot. Either way,  
if the register live range is such that it's insufficient to check if  
the register is available at the current location, then it's a flaw in  
the way the back end is using this pass.

That said, that's an assumption worth enforcing. Perhaps in the loop,  
if we have an active scavenged register, we should add an assert()  
that the current instruction does not reference it. The reference  
we're looking for will still be a virtual register, so it won't fire  
on that. Hmmm.

>> +              assert (NewReg && "unable to scavenge register!");
>> +              ScratchRegForVirtReg[Reg] = NewReg;
>> +            }
>> +            MI->getOperand(i).setReg(NewReg);
>
> For full generality, you need to watch out for subreg operands when  
> replacing a virtreg with a physreg. At least you should assert 
> (getSubReg()==0).

Yeah, adding an assert is a good idea. We shouldn't be seeing any of  
that sort of thing, and it's best to verify that when possible.

>> +          }
>> +        }
>> +      RS->forward(MI);
>
> Yeah, I also want to get rid of forward() as a public method. Since  
> scavengeRegister takes an iterator argument that /must/ be next(MI)  
> anyway, there is really no point to having it.

I'm a bit leery of merging it with scavengeRegister(). It feels more  
appropriate to keep those bits of functionality separate since they're  
logically distinct. I can envision wanting to do things like tracking  
to a point in the block and only conditionally scavenging a register  
before proceeding.
>
> Batch forwarding over a range of instructions is going to be faster  
> as well.

forward(MachineBasicBlock::iterator I) just loops over calls to forward 
() currently. Any performance improvements available from merging the  
forwarding into scavengeRegister() should be equally applicable as  
improvements to the current implementation, I would think.

Can you elaborate a bit on why you want to remove forward()? I like  
the idea of removing methods that aren't really necessary, but I'm not  
sure that's the case here. Are there other motivations?

Regards,
-Jim



More information about the llvm-commits mailing list