[llvm-commits] [llvm] r82734 - in /llvm/trunk/lib: CodeGen/PrologEpilogInserter.cpp CodeGen/PrologEpilogInserter.h Target/ARM/ARMBaseRegisterInfo.cpp Target/ARM/Thumb1RegisterInfo.cpp

Wed Sep 30 14:30:02 PDT 2009

I haven't followed the discussion entirely. But there are a couple of  
things I'd like to emphasize.

1. Correctness first, optimization later.
2. The frame offset availability should be tracked separately from  
register liveness. This can be a separate module or a PEI internal  
data structure (for now).

Evan

On Sep 30, 2009, at 1:45 PM, Jim Grosbach wrote:

>
> On Sep 28, 2009, at 11:23 AM, Jakob Stoklund Olesen wrote:
>> [snip]
>> I think that scavengerRegister should implicitly call FindUnusedReg
>> so that clients don't have to worry about it. I can see a case for
>> still needing a raw FindUnusedReg, yes.
>>
>> But the needed function is probably really
>> FindUnusedRegThatIsAlsoNotUsedByThisInstructionRange().
>>
>> Such clever clients must always be able to handle FindUnusedReg
>> returning 0. It looks like the ARMLoadStoreOptimizer handles that
>> correctly.
>>
>
> Yep, we're on the same page here.
>
>>>> There is another issue here. FindUnusedReg() will give you a
>>>> register that is unused /at the current position/. You really need
>>>> a register that is unused in the whole live range of the virtual
>>>> register you are replacing. scavengeRegister() won't return a
>>>> register that is used by the instruction at I, but it looks like
>>>> we need a variant that can exclude from whole range of  
>>>> instructions.
>>>>
>>>> The rule should be: "NewReg must not be defined by any instruction
>>>> in the range, except for the last. NewReg must not be an
>>>> EarlyClobber operand on the last instruction."
>>>
>>> In the general case, yes, that's true. In the cases we're dealing
>>> with, however, we don't need to be that careful. The register is
>>> defined, then used in the next instruction or two. We're splitting
>>> out a single instruction referencing the frame when the constant
>>> offset is too large. Either we're adjusting the stack point by a
>>> constant value and need a register to hold the constant, or we're
>>> accessing a stack slot and need a register to hold the offset to
>>> the slot. Either way, if the register live range is such that it's
>>> insufficient to check if the register is available at the current
>>> location, then it's a flaw in the way the back end is using this
>>> pass.
>>
>> An example from the Blackfin backend. Storing a 16-bit subregister
>> to a stack slot:
>>
>> STORE16fi P0.L, <fi#-1>, 0
>>
>> With the new vreg code, eliminateFrameIndex will produce:
>>
>> %reg2000 = LOADuimm16 100
>> %reg2000 = ADDpp %reg2000, %FP
>> STORE16pi %reg2000, %P0.L
>>
>> Now we need to replace the vreg with a scavenged register. Assume
>> all P-class registers are used, so we must scavenge one. If
>> scavengeRegister() chooses to scavenge P0, we are dead - the code
>> would become
>>
>> STORE32pi %FP, emergengy-slot, P0
>> %P0 = LOADuimm16 100
>> %P0 = ADDpp %P0, %FP
>> STORE16pi %P0, %P0.L
>> LOAD32pi %P0, %FP, emergency-slot
>>
>> This disaster won't actually happen because the heuristics in
>> RegScavenger::findSurvivorReg() choose the candidate that is
>> untouched for the longest time. But that is just a heuristic, we
>> shouldn't rely on that.
>>
>> I think scavengeRegister needs to examine all three instructions
>> (the live range of %reg2000) to avoid such disasters.
>>
>
> Thanks for the example, that clarifies things nicely. I agree what
> we're doing now is taking too much advantage of the heuristics of the
> scavenger. Tight coupling is bad and all that.
>
> I'll take a look at expanding the interfaces. I don't think it'll be
> too bad.
>
>> Yes. The performance improvement I am after mostly involves the four
>> BitVectors in RegScavenger::forward(). They currently cost four
>> mallocs and frees per instruction. By swapping forward() and forward
>> (MI), we could make a tighter loop around the core logic, and only
>> allocate BitVectors once per bulk skip.
>>
>
> That makes sense. Implementing forward() in terms of foward(MI)
> instead of the other way around seems a clear win.
>
>>> Can you elaborate a bit on why you want to remove forward()? I like
>>> the idea of removing methods that aren't really necessary, but I'm
>>> not sure that's the case here. Are there other motivations?
>>
>> I want to eliminate redundancy from the API. Right now, you have to
>> call forward(MI) to advance the scavenger to your position. Then you
>> call scavengeRegister(RC, I, 0) where I /must/ be equal to next(MI).
>> It is an error to pass any other iterator, so what is the point of
>> having the argument?
>>
>> So you can choose to remove the iterator argument from
>> scavengeRegister(), or you can get rid of forward() (as a public
>> method, anyway). I am shooting for forward() because the semantics
>> are already a bit weird. Calling forward(MI) places the scavenger /
>> after/ MI while BuildMI() inserts instructions /before/ the
>> iterator. That is why you are calling forward() at the bottom of you
>> loop, rather than at the top. There are good reasons for this, but
>> it is still weird.
>>
>> By removing forward(), we push the weirdness into the RegScavenger.
>>
>> If the choose to retain FindUnusedReg, it would need to take an
>> iterator argument as well.
>>
>> But given the discussion above, I think that both methods should
>> take two iterator arguments specifying a range of instructions that
>> must not be using the returned register.
>>
>
> I agree with all that, with one caveat. I can see usefulness of
> keeping the ability to iterate over a block without always calling
> FindUnusedReg(). A target may have its own mechanisms for scavenging a
> free register, for example, but want to use the standard scavenger for
> tracking liveness.
>
> That is, I agree FindUnusedReg() and scavengeRegister() should call
> forward(MI) directly to enforce their assumptions. I also would like
> to keep forward(MI) as a public interface.
>
> On the gripping hand, perhaps keeping the method but moving it private
> for now is best, and if and when we implement something that needs to
> call forward(MI) outside of these instances, we can move it back to
> public then.
>
> Make sense?
>
>
> Regards,
> -Jim
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits