[llvm-commits] [llvm] r89187 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/BranchFolding.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h lib/Target/ARM/ARMSubtarget.cpp lib/Target/ARM/ARMSubtarget.h
Bob Wilson
bob.wilson at apple.com
Wed Nov 18 09:29:42 PST 2009
The only disadvantage of what you suggest is code size. There is
currently no limit on the number of predecessors where a block may be
duplicated, so the effect of duplicating a small block can be
magnified in the overall code size. (And, at least for duplicating
indirect branches on ARM Cortex processors, we don't want to limit
that.) On a small low-power device without sophisticated branch
prediction, where code size typically matters more than usual, we'll
be doing the wrong thing. But, indirect branches are not very common,
so it probably doesn't really matter so much.
Most modern high-performance processors will benefit from tail
duplicating indirect branches, so that would be another reason to
avoid the target hook.
Unless someone else has another idea, I'll get rid of the tail
duplication target hook. As you mention, we'll need a way to identify
indirect branches. I'd prefer to add a new IsIndirectBranch target
hook. This goes against your desire to avoid new target hooks, but
it's nice and simple. Alternatively, we could add another bool
reference argument to AnalyzeBranch. When AnalyzeBranch returns true,
which currently means that it cannot understand the branch, it would
set the new argument to indicate whether it is an indirect branch.
That doesn't seem to fit very well with the AnalyzeBranch interface,
which is already pretty complicated, so I don't like that so much.
On Nov 18, 2009, at 8:39 AM, Chris Lattner wrote:
> I understand that this is important for performance, but why isn't
> this a win (or at least not a loss) for all targets? Would it be
> reasonable to just increase the threshold (in the taildup code
> itself) for unconditional branches to blocks that end with an
> indirect jump regardless of CPU and whether the jump came from a
> switch or indirect goto?
>
> I guess what I'm getting at is that I'd prefer to not add yet-
> another target hook (particularly one where targets return an
> arbitrary and unitless value), and just make analyze branch smarter
> if needbe. If the goal is to taildup indirect branches, it would be
> better to look for that target independent structure rather than
> foist the issue onto target authors.
>
> -Chris
>
> On Nov 18, 2009, at 5:34 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>
>> Author: bwilson
>> Date: Tue Nov 17 21:34:27 2009
>> New Revision: 89187
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=89187&view=rev
>> Log:
>> Add a target hook to allow changing the tail duplication limit
>> based on the
>> contents of the block to be duplicated. Use this for ARM Cortex
>> A8/9 to
>> be more aggressive tail duplicating indirect branches, since it
>> makes it
>> much more likely that they will be predicted in the branch target
>> buffer.
>> Testcase coming soon.
>>
>> Modified:
>> llvm/trunk/include/llvm/Target/TargetInstrInfo.h
>> llvm/trunk/lib/CodeGen/BranchFolding.cpp
>> llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
>> llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h
>> llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp
>> llvm/trunk/lib/Target/ARM/ARMSubtarget.h
>>
>> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=89187&r1=89186&r2=89187&view=diff
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
>> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Nov 17
>> 21:34:27 2009
>> @@ -536,6 +536,13 @@
>> /// length.
>> virtual unsigned getInlineAsmLength(const char *Str,
>> const MCAsmInfo &MAI) const;
>> +
>> + /// TailDuplicationLimit - Returns the limit on the number of
>> instructions
>> + /// in basic block MBB beyond which it will not be tail-
>> duplicated.
>> + virtual unsigned TailDuplicationLimit(const MachineBasicBlock
>> &MBB,
>> + unsigned DefaultLimit)
>> const {
>> + return DefaultLimit;
>> + }
>> };
>>
>> /// TargetInstrInfoImpl - This is the default implementation of
>>
>> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89187&r1=89186&r2=89187&view=diff
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Nov 17 21:34:27 2009
>> @@ -1033,12 +1033,13 @@
>> if (TailBB->isSuccessor(TailBB))
>> return false;
>>
>> - // Duplicate up to one less than the tail-merge threshold. When
>> optimizing
>> - // for size, duplicate only one, because one branch instruction
>> can be
>> - // eliminated to compensate for the duplication.
>> + // Set the limit on the number of instructions to duplicate,
>> with a default
>> + // of one less than the tail-merge threshold. When optimizing
>> for size,
>> + // duplicate only one, because one branch instruction can be
>> eliminated to
>> + // compensate for the duplication.
>> unsigned MaxDuplicateCount =
>> MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize) ?
>> - 1 : (TailMergeSize - 1);
>> + 1 : TII->TailDuplicationLimit(*TailBB, TailMergeSize - 1);
>>
>> // Check the instructions in the block to determine whether tail-
>> duplication
>> // is invalid or unlikely to be profitable.
>>
>> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89187&r1=89186&r2=89187&view=diff
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original)
>> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 17
>> 21:34:27 2009
>> @@ -1005,6 +1005,16 @@
>> return TargetInstrInfoImpl::isIdentical(MI0, MI1, MRI);
>> }
>>
>> +unsigned ARMBaseInstrInfo::TailDuplicationLimit(const
>> MachineBasicBlock &MBB,
>> + unsigned
>> DefaultLimit) const {
>> + // If the target processor can predict indirect branches, it is
>> highly
>> + // desirable to duplicate them, since it can often make them
>> predictable.
>> + if (!MBB.empty() && isIndirectBranchOpcode(MBB.back().getOpcode
>> ()) &&
>> + getSubtarget().hasBranchTargetBuffer())
>> + return DefaultLimit + 2;
>> + return DefaultLimit;
>> +}
>> +
>> /// getInstrPredicate - If instruction is predicated, returns its
>> predicate
>> /// condition, otherwise returns AL. It also returns the condition
>> code
>> /// register by reference.
>>
>> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89187&r1=89186&r2=89187&view=diff
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original)
>> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Tue Nov 17
>> 21:34:27 2009
>> @@ -272,6 +272,9 @@
>>
>> virtual bool isIdentical(const MachineInstr *MI, const
>> MachineInstr *Other,
>> const MachineRegisterInfo *MRI) const;
>> +
>> + virtual unsigned TailDuplicationLimit(const MachineBasicBlock
>> &MBB,
>> + unsigned DefaultLimit)
>> const;
>> };
>>
>> static inline
>>
>> Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp?rev=89187&r1=89186&r2=89187&view=diff
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> --- llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp (original)
>> +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp Tue Nov 17 21:34:27
>> 2009
>> @@ -109,6 +109,8 @@
>> if (UseNEONFP.getPosition() == 0)
>> UseNEONForSinglePrecisionFP = true;
>> }
>> + HasBranchTargetBuffer = (CPUString == "cortex-a8" ||
>> + CPUString == "cortex-a9");
>> }
>>
>> /// GVIsIndirectSymbol - true if the GV will be accessed via an
>> indirect symbol.
>>
>> Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.h?rev=89187&r1=89186&r2=89187&view=diff
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> --- llvm/trunk/lib/Target/ARM/ARMSubtarget.h (original)
>> +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.h Tue Nov 17 21:34:27 2009
>> @@ -50,6 +50,9 @@
>> /// determine if NEON should actually be used.
>> bool UseNEONForSinglePrecisionFP;
>>
>> + /// HasBranchTargetBuffer - True if processor can predict
>> indirect branches.
>> + bool HasBranchTargetBuffer;
>> +
>> /// IsThumb - True if we are in thumb mode, false if in ARM mode.
>> bool IsThumb;
>>
>> @@ -123,6 +126,8 @@
>> bool isThumb2() const { return IsThumb && (ThumbMode == Thumb2); }
>> bool hasThumb2() const { return ThumbMode >= Thumb2; }
>>
>> + bool hasBranchTargetBuffer() const { return
>> HasBranchTargetBuffer; }
>> +
>> bool isR9Reserved() const { return IsR9Reserved; }
>>
>> const std::string & getCPUString() const { return CPUString; }
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list