[llvm-commits] [llvm] r89187 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/BranchFolding.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h lib/Target/ARM/ARMSubtarget.cpp lib/Target/ARM/ARMSubtarget.h

Evan Cheng evan.cheng at apple.com
Wed Nov 18 11:01:10 PST 2009


On Nov 18, 2009, at 9:29 AM, Bob Wilson wrote:

> The only disadvantage of what you suggest is code size.  There is  
> currently no limit on the number of predecessors where a block may be  
> duplicated, so the effect of duplicating a small block can be  
> magnified in the overall code size.  (And, at least for duplicating  
> indirect branches on ARM Cortex processors, we don't want to limit  
> that.)  On a small low-power device without sophisticated branch  
> prediction, where code size typically matters more than usual, we'll  
> be doing the wrong thing.  But, indirect branches are not very common,  
> so it probably doesn't really matter so much.
> 
> Most modern high-performance processors will benefit from tail  
> duplicating indirect branches, so that would be another reason to  
> avoid the target hook.

The "cost" of branches varies across targets, though. We really need to be sure before we eliminate the target hook.

Evan

> 
> Unless someone else has another idea, I'll get rid of the tail  
> duplication target hook.  As you mention, we'll need a way to identify  
> indirect branches.  I'd prefer to add a new IsIndirectBranch target  
> hook.  This goes against your desire to avoid new target hooks, but  
> it's nice and simple.  Alternatively, we could add another bool  
> reference argument to AnalyzeBranch.  When AnalyzeBranch returns true,  
> which currently means that it cannot understand the branch, it would  
> set the new argument to indicate whether it is an indirect branch.   
> That doesn't seem to fit very well with the AnalyzeBranch interface,  
> which is already pretty complicated, so I don't like that so much.
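To make the two alternatives above concrete, here is a minimal sketch in standalone C++. Everything in it (Opcode, Instr, Block, ToyInstrInfo and its methods) is a hypothetical stand-in invented for illustration; it is not the TargetInstrInfo interface, and the real AnalyzeBranch takes more arguments than shown.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are not the LLVM classes.
enum class Opcode { Add, Load, Branch, CondBranch, IndirectBranch };
struct Instr { Opcode Op; };
struct Block { std::vector<Instr> Insts; };

struct ToyInstrInfo {
  virtual ~ToyInstrInfo() = default;

  // Option 1: a dedicated yes/no hook on a single instruction.
  virtual bool isIndirectBranch(const Instr &I) const {
    return I.Op == Opcode::IndirectBranch;
  }

  // Option 2: keep one entry point and add a bool out-parameter.
  // Returns true when the terminator cannot be understood; only in that
  // case does IsIndirect say whether an indirect branch was the reason.
  virtual bool analyzeBranch(const Block &B, bool &IsIndirect) const {
    IsIndirect = false;
    if (B.Insts.empty())
      return false; // Empty block falls through; nothing to analyze.
    if (B.Insts.back().Op == Opcode::IndirectBranch) {
      IsIndirect = true;
      return true;
    }
    return false; // In this toy model every other terminator is understood.
  }
};

// A caller written against option 1, cross-checked against option 2.
bool endsInIndirectBranch(const ToyInstrInfo &TII, const Block &B) {
  bool ViaHook = !B.Insts.empty() && TII.isIndirectBranch(B.Insts.back());
  bool ViaAnalyze = false;
  bool Failed = TII.analyzeBranch(B, ViaAnalyze);
  assert(ViaHook == (Failed && ViaAnalyze) && "both options should agree");
  return ViaHook;
}
```

In this sketch option 1 answers a single question, while option 2 makes callers interpret a failure return together with an out-parameter, which is the interface complication described above.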
> 
> On Nov 18, 2009, at 8:39 AM, Chris Lattner wrote:
> 
>> I understand that this is important for performance, but why isn't  
>> this a win (or at least not a loss) for all targets?  Would it be  
>> reasonable to just increase the threshold (in the taildup code  
>> itself) for unconditional branches to blocks that end with an  
>> indirect jump regardless of CPU and whether the jump came from a  
>> switch or indirect goto?
>> 
>> I guess what I'm getting at is that I'd prefer not to add yet
>> another target hook (particularly one where targets return an
>> arbitrary and unitless value), and just make AnalyzeBranch smarter
>> if need be.  If the goal is to taildup indirect branches, it would be
>> better to look for that target-independent structure rather than
>> foist the issue onto target authors.
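A target-independent version of the limit computation, as suggested above, might look roughly like the following standalone sketch. The types and function names are hypothetical stand-ins rather than LLVM code; the bonus of 2 mirrors the "DefaultLimit + 2" in the committed ARM hook, and whether that value suits every target is exactly the open question in this thread.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are not the LLVM classes.
enum class Opcode { Add, Load, Branch, IndirectBranch };
struct Instr { Opcode Op; };
struct Block { std::vector<Instr> Insts; };

// Extra instructions allowed when the block ends in an indirect branch.
// Assumption: reuse the "+ 2" from the ARM hook in r89187 for all targets.
constexpr unsigned IndirectBranchBonus = 2;

bool endsInIndirectBranch(const Block &B) {
  return !B.Insts.empty() && B.Insts.back().Op == Opcode::IndirectBranch;
}

// Computes the duplication limit with no target hook involved.
unsigned maxDuplicateCount(const Block &TailBB, bool OptForSize,
                           unsigned TailMergeSize) {
  assert(TailMergeSize >= 1 && "tail-merge threshold must be positive");
  // When optimizing for size, duplicate only one instruction, because one
  // branch can be eliminated to compensate for the duplication.
  if (OptForSize)
    return 1;
  unsigned Limit = TailMergeSize - 1;
  if (endsInIndirectBranch(TailBB))
    Limit += IndirectBranchBonus;
  return Limit;
}
```

With a tail-merge threshold of 3, this gives a limit of 4 for a block ending in an indirect branch and 2 for any other block, regardless of CPU.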
>> 
>> -Chris
>> 
>> On Nov 18, 2009, at 5:34 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>> 
>>> Author: bwilson
>>> Date: Tue Nov 17 21:34:27 2009
>>> New Revision: 89187
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=89187&view=rev
>>> Log:
>>> Add a target hook to allow changing the tail duplication limit based on the
>>> contents of the block to be duplicated.  Use this for ARM Cortex A8/9 to
>>> be more aggressive tail duplicating indirect branches, since it makes it
>>> much more likely that they will be predicted in the branch target buffer.
>>> Testcase coming soon.
>>> 
>>> Modified:
>>>  llvm/trunk/include/llvm/Target/TargetInstrInfo.h
>>>  llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>>  llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
>>>  llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h
>>>  llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp
>>>  llvm/trunk/lib/Target/ARM/ARMSubtarget.h
>>> 
>>> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=89187&r1=89186&r2=89187&view=diff
>>> 
>>> ===================================================================
>>> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
>>> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Nov 17 21:34:27 2009
>>> @@ -536,6 +536,13 @@
>>> /// length.
>>> virtual unsigned getInlineAsmLength(const char *Str,
>>>                                     const MCAsmInfo &MAI) const;
>>> +
>>> +  /// TailDuplicationLimit - Returns the limit on the number of instructions
>>> +  /// in basic block MBB beyond which it will not be tail-duplicated.
>>> +  virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB,
>>> +                                        unsigned DefaultLimit) const {
>>> +    return DefaultLimit;
>>> +  }
>>> };
>>> 
>>> /// TargetInstrInfoImpl - This is the default implementation of
>>> 
>>> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89187&r1=89186&r2=89187&view=diff
>>> 
>>> ===================================================================
>>> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Nov 17 21:34:27 2009
>>> @@ -1033,12 +1033,13 @@
>>> if (TailBB->isSuccessor(TailBB))
>>>   return false;
>>> 
>>> -  // Duplicate up to one less than the tail-merge threshold. When optimizing
>>> -  // for size, duplicate only one, because one branch instruction can be
>>> -  // eliminated to compensate for the duplication.
>>> +  // Set the limit on the number of instructions to duplicate, with a default
>>> +  // of one less than the tail-merge threshold. When optimizing for size,
>>> +  // duplicate only one, because one branch instruction can be eliminated to
>>> +  // compensate for the duplication.
>>> unsigned MaxDuplicateCount =
>>>   MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize) ?
>>> -      1 : (TailMergeSize - 1);
>>> +    1 : TII->TailDuplicationLimit(*TailBB, TailMergeSize - 1);
>>> 
>>> // Check the instructions in the block to determine whether tail-duplication
>>> // is invalid or unlikely to be profitable.
>>> 
>>> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89187&r1=89186&r2=89187&view=diff
>>> 
>>> ===================================================================
>>> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original)
>>> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 17 21:34:27 2009
>>> @@ -1005,6 +1005,16 @@
>>> return TargetInstrInfoImpl::isIdentical(MI0, MI1, MRI);
>>> }
>>> 
>>> +unsigned ARMBaseInstrInfo::TailDuplicationLimit(const MachineBasicBlock &MBB,
>>> +                                                unsigned DefaultLimit) const {
>>> +  // If the target processor can predict indirect branches, it is highly
>>> +  // desirable to duplicate them, since it can often make them predictable.
>>> +  if (!MBB.empty() && isIndirectBranchOpcode(MBB.back().getOpcode()) &&
>>> +      getSubtarget().hasBranchTargetBuffer())
>>> +    return DefaultLimit + 2;
>>> +  return DefaultLimit;
>>> +}
>>> +
>>> /// getInstrPredicate - If instruction is predicated, returns its predicate
>>> /// condition, otherwise returns AL. It also returns the condition code
>>> /// register by reference.
>>> 
>>> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89187&r1=89186&r2=89187&view=diff
>>> 
>>> ===================================================================
>>> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original)
>>> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Tue Nov 17 21:34:27 2009
>>> @@ -272,6 +272,9 @@
>>> 
>>> virtual bool isIdentical(const MachineInstr *MI, const MachineInstr *Other,
>>>                          const MachineRegisterInfo *MRI) const;
>>> +
>>> +  virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB,
>>> +                                        unsigned DefaultLimit) const;
>>> };
>>> 
>>> static inline
>>> 
>>> Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp?rev=89187&r1=89186&r2=89187&view=diff
>>> 
>>> ===================================================================
>>> --- llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp (original)
>>> +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp Tue Nov 17 21:34:27 2009
>>> @@ -109,6 +109,8 @@
>>>   if (UseNEONFP.getPosition() == 0)
>>>     UseNEONForSinglePrecisionFP = true;
>>> }
>>> +  HasBranchTargetBuffer = (CPUString == "cortex-a8" ||
>>> +                           CPUString == "cortex-a9");
>>> }
>>> 
>>> /// GVIsIndirectSymbol - true if the GV will be accessed via an indirect symbol.
>>> 
>>> Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.h
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.h?rev=89187&r1=89186&r2=89187&view=diff
>>> 
>>> ===================================================================
>>> --- llvm/trunk/lib/Target/ARM/ARMSubtarget.h (original)
>>> +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.h Tue Nov 17 21:34:27 2009
>>> @@ -50,6 +50,9 @@
>>> /// determine if NEON should actually be used.
>>> bool UseNEONForSinglePrecisionFP;
>>> 
>>> +  /// HasBranchTargetBuffer - True if processor can predict indirect branches.
>>> +  bool HasBranchTargetBuffer;
>>> +
>>> /// IsThumb - True if we are in thumb mode, false if in ARM mode.
>>> bool IsThumb;
>>> 
>>> @@ -123,6 +126,8 @@
>>> bool isThumb2() const { return IsThumb && (ThumbMode == Thumb2); }
>>> bool hasThumb2() const { return ThumbMode >= Thumb2; }
>>> 
>>> +  bool hasBranchTargetBuffer() const { return HasBranchTargetBuffer; }
>>> +
>>> bool isR9Reserved() const { return IsR9Reserved; }
>>> 
>>> const std::string & getCPUString() const { return CPUString; }
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




