[llvm-commits] [llvm] r82734 - in /llvm/trunk/lib: CodeGen/PrologEpilogInserter.cpp CodeGen/PrologEpilogInserter.h Target/ARM/ARMBaseRegisterInfo.cpp Target/ARM/Thumb1RegisterInfo.cpp

Wed Sep 30 13:45:59 PDT 2009

On Sep 28, 2009, at 11:23 AM, Jakob Stoklund Olesen wrote:
> [snip]
> I think that scavengerRegister should implicitly call FindUnusedReg  
> so that clients don't have to worry about it. I can see a case for  
> still needing a raw FindUnusedReg, yes.
>
> But the needed function is probably really  
> FindUnusedRegThatIsAlsoNotUsedByThisInstructionRange().
>
> Such clever clients must always be able to handle FindUnusedReg  
> returning 0. It looks like the ARMLoadStoreOptimizer handles that  
> correctly.
>

Yep, we're on the same page here.

>>> There is another issue here. FindUnusedReg() will give you a  
>>> register that is unused /at the current position/. You really need  
>>> a register that is unused in the whole live range of the virtual  
>>> register you are replacing. scavengeRegister() won't return a  
>>> register that is used by the instruction at I, but it looks like  
>>> we need a variant that can exclude from whole range of instructions.
>>>
>>> The rule should be: "NewReg must not be defined by any instruction  
>>> in the range, except for the last. NewReg must not be an  
>>> EarlyClobber operand on the last instruction."
>>
>> In the general case, yes, that's true. In the cases we're dealing  
>> with, however, we don't need to be that careful. The register is  
>> defined, then used in the next instruction or two. We're splitting  
>> out a single instruction referencing the frame when the constant  
>> offset is too large. Either we're adjusting the stack point by a  
>> constant value and need a register to hold the constant, or we're  
>> accessing a stack slot and need a register to hold the offset to  
>> the slot. Either way, if the register live range is such that it's  
>> insufficient to check if the register is available at the current  
>> location, then it's a flaw in the way the back end is using this  
>> pass.
>
> An example from the Blackfin backend. Storing a 16-bit subregister  
> to a stack slot:
>
> STORE16fi P0.L, <fi#-1>, 0
>
> With the new vreg code, eliminateFrameIndex will produce:
>
> %reg2000 = LOADuimm16 100
> %reg2000 = ADDpp %reg2000, %FP
> STORE16pi %reg2000, %P0.L
>
> Now we need to replace the vreg with a scavenged register. Assume  
> all P-class registers are used, so we must scavenge one. If  
> scavengeRegister() chooses to scavenge P0, we are dead - the code  
> would become
>
> STORE32pi %FP, emergengy-slot, P0
> %P0 = LOADuimm16 100
> %P0 = ADDpp %P0, %FP
> STORE16pi %P0, %P0.L
> LOAD32pi %P0, %FP, emergency-slot
>
> This disaster won't actually happen because the heuristics in  
> RegScavenger::findSurvivorReg() choose the candidate that is  
> untouched for the longest time. But that is just a heuristic, we  
> shouldn't rely on that.
>
> I think scavengeRegister needs to examine all three instructions  
> (the live range of %reg2000) to avoid such disasters.
>

Thanks for the example, that clarifies things nicely. I agree what  
we're doing now is taking too much advantage of the heuristics of the  
scavenger. Tight coupling is bad and all that.

I'll take a look at expanding the interfaces. I don't think it'll be  
too bad.

> Yes. The performance improvement I am after mostly involves the four  
> BitVectors in RegScavenger::forward(). They currently cost four  
> mallocs and frees per instruction. By swapping forward() and forward 
> (MI), we could make a tighter loop around the core logic, and only  
> allocate BitVectors once per bulk skip.
>

That makes sense. Implementing forward() in terms of foward(MI)  
instead of the other way around seems a clear win.

>> Can you elaborate a bit on why you want to remove forward()? I like  
>> the idea of removing methods that aren't really necessary, but I'm  
>> not sure that's the case here. Are there other motivations?
>
> I want to eliminate redundancy from the API. Right now, you have to  
> call forward(MI) to advance the scavenger to your position. Then you  
> call scavengeRegister(RC, I, 0) where I /must/ be equal to next(MI).  
> It is an error to pass any other iterator, so what is the point of  
> having the argument?
>
> So you can choose to remove the iterator argument from  
> scavengeRegister(), or you can get rid of forward() (as a public  
> method, anyway). I am shooting for forward() because the semantics  
> are already a bit weird. Calling forward(MI) places the scavenger / 
> after/ MI while BuildMI() inserts instructions /before/ the  
> iterator. That is why you are calling forward() at the bottom of you  
> loop, rather than at the top. There are good reasons for this, but  
> it is still weird.
>
> By removing forward(), we push the weirdness into the RegScavenger.
>
> If the choose to retain FindUnusedReg, it would need to take an  
> iterator argument as well.
>
> But given the discussion above, I think that both methods should  
> take two iterator arguments specifying a range of instructions that  
> must not be using the returned register.
>

I agree with all that, with one caveat. I can see usefulness of  
keeping the ability to iterate over a block without always calling  
FindUnusedReg(). A target may have its own mechanisms for scavenging a  
free register, for example, but want to use the standard scavenger for  
tracking liveness.

That is, I agree FindUnusedReg() and scavengeRegister() should call  
forward(MI) directly to enforce their assumptions. I also would like  
to keep forward(MI) as a public interface.

On the gripping hand, perhaps keeping the method but moving it private  
for now is best, and if and when we implement something that needs to  
call forward(MI) outside of these instances, we can move it back to  
public then.

Make sense?

Regards,
-Jim