[llvm-commits] [llvm] r89187 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/BranchFolding.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h lib/Target/ARM/ARMSubtarget.cpp lib/Target/ARM/ARMSubtarget.h
Chris Lattner
clattner at apple.com
Wed Nov 18 08:39:26 PST 2009
I understand that this is important for performance, but why isn't
this a win (or at least not a loss) for all targets? Would it be
reasonable to just increase the threshold (in the taildup code itself)
for unconditional branches to blocks that end with an indirect jump
regardless of CPU and whether the jump came from a switch or indirect
goto?
I guess what I'm getting at is that I'd prefer not to add yet another
target hook (particularly one where targets return an arbitrary and
unitless value), and just make analyze branch smarter if need be. If
the goal is to taildup indirect branches, it would be better to look
for that target independent structure rather than foist the issue onto
target authors.
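
A minimal sketch of the target-independent alternative described above, assuming simplified stand-in types rather than the real LLVM classes. `Instr`, `Block`, `endsWithIndirectBranch`, `maxDuplicateCount`, and the bonus of 2 (borrowed from the `DefaultLimit + 2` in the quoted ARM hook) are illustrative assumptions, not code from the tree:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for MachineInstr / MachineBasicBlock (not LLVM types).
enum class Kind { Other, IndirectBranch };
struct Instr { Kind kind; };
struct Block { std::vector<Instr> instrs; };

// Assumed value, mirroring the "+ 2" used by the ARM hook in the quoted commit.
constexpr unsigned IndirectBranchBonus = 2;

// Target-independent structural check: does the block end in an indirect jump?
bool endsWithIndirectBranch(const Block &bb) {
  return !bb.instrs.empty() && bb.instrs.back().kind == Kind::IndirectBranch;
}

// Computes the duplication limit inside the taildup code itself, with no
// per-target hook and no CPU check.
unsigned maxDuplicateCount(const Block &tail, unsigned tailMergeSize,
                           bool optForSize) {
  assert(tailMergeSize >= 1 && "threshold must be positive");
  if (optForSize)
    return 1; // one branch is eliminated, which pays for one duplicated instr
  unsigned limit = tailMergeSize - 1;
  if (endsWithIndirectBranch(tail))
    limit += IndirectBranchBonus;
  return limit;
}
```

With a tail-merge threshold of 3, a block ending in an indirect branch would get a limit of 4 and any other block a limit of 2, on every target. Whether the bonus should be 2 is a tuning question this sketch does not answer.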
-Chris
On Nov 18, 2009, at 5:34 AM, Bob Wilson <bob.wilson at apple.com> wrote:
> Author: bwilson
> Date: Tue Nov 17 21:34:27 2009
> New Revision: 89187
>
> URL: http://llvm.org/viewvc/llvm-project?rev=89187&view=rev
> Log:
> Add a target hook to allow changing the tail duplication limit based on the
> contents of the block to be duplicated. Use this for ARM Cortex A8/9 to
> be more aggressive tail duplicating indirect branches, since it makes it
> much more likely that they will be predicted in the branch target buffer.
> Testcase coming soon.
>
> Modified:
> llvm/trunk/include/llvm/Target/TargetInstrInfo.h
> llvm/trunk/lib/CodeGen/BranchFolding.cpp
> llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
> llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h
> llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp
> llvm/trunk/lib/Target/ARM/ARMSubtarget.h
>
> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=89187&r1=89186&r2=89187&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Nov 17 21:34:27 2009
> @@ -536,6 +536,13 @@
> /// length.
> virtual unsigned getInlineAsmLength(const char *Str,
> const MCAsmInfo &MAI) const;
> +
> + /// TailDuplicationLimit - Returns the limit on the number of instructions
> + /// in basic block MBB beyond which it will not be tail-duplicated.
> + virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB,
> + unsigned DefaultLimit) const {
> + return DefaultLimit;
> + }
> };
>
> /// TargetInstrInfoImpl - This is the default implementation of
>
> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89187&r1=89186&r2=89187&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original)
> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Nov 17 21:34:27 2009
> @@ -1033,12 +1033,13 @@
> if (TailBB->isSuccessor(TailBB))
> return false;
>
> - // Duplicate up to one less than the tail-merge threshold. When optimizing
> - // for size, duplicate only one, because one branch instruction can be
> - // eliminated to compensate for the duplication.
> + // Set the limit on the number of instructions to duplicate, with a default
> + // of one less than the tail-merge threshold. When optimizing for size,
> + // duplicate only one, because one branch instruction can be eliminated to
> + // compensate for the duplication.
> unsigned MaxDuplicateCount =
> MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize) ?
> - 1 : (TailMergeSize - 1);
> + 1 : TII->TailDuplicationLimit(*TailBB, TailMergeSize - 1);
>
> // Check the instructions in the block to determine whether tail-duplication
> // is invalid or unlikely to be profitable.
>
> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89187&r1=89186&r2=89187&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original)
> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 17 21:34:27 2009
> @@ -1005,6 +1005,16 @@
> return TargetInstrInfoImpl::isIdentical(MI0, MI1, MRI);
> }
>
> +unsigned ARMBaseInstrInfo::TailDuplicationLimit(const MachineBasicBlock &MBB,
> + unsigned DefaultLimit) const {
> + // If the target processor can predict indirect branches, it is highly
> + // desirable to duplicate them, since it can often make them predictable.
> + if (!MBB.empty() && isIndirectBranchOpcode(MBB.back().getOpcode()) &&
> + getSubtarget().hasBranchTargetBuffer())
> + return DefaultLimit + 2;
> + return DefaultLimit;
> +}
> +
> /// getInstrPredicate - If instruction is predicated, returns its predicate
> /// condition, otherwise returns AL. It also returns the condition code
> /// register by reference.
>
> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89187&r1=89186&r2=89187&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original)
> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Tue Nov 17 21:34:27 2009
> @@ -272,6 +272,9 @@
>
> virtual bool isIdentical(const MachineInstr *MI, const MachineInstr *Other,
> const MachineRegisterInfo *MRI) const;
> +
> + virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB,
> + unsigned DefaultLimit) const;
> };
>
> static inline
>
> Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp?rev=89187&r1=89186&r2=89187&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp (original)
> +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp Tue Nov 17 21:34:27 2009
> @@ -109,6 +109,8 @@
> if (UseNEONFP.getPosition() == 0)
> UseNEONForSinglePrecisionFP = true;
> }
> + HasBranchTargetBuffer = (CPUString == "cortex-a8" ||
> + CPUString == "cortex-a9");
> }
>
> /// GVIsIndirectSymbol - true if the GV will be accessed via an indirect symbol.
>
> Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.h?rev=89187&r1=89186&r2=89187&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/ARM/ARMSubtarget.h (original)
> +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.h Tue Nov 17 21:34:27 2009
> @@ -50,6 +50,9 @@
> /// determine if NEON should actually be used.
> bool UseNEONForSinglePrecisionFP;
>
> + /// HasBranchTargetBuffer - True if processor can predict indirect branches.
> + bool HasBranchTargetBuffer;
> +
> /// IsThumb - True if we are in thumb mode, false if in ARM mode.
> bool IsThumb;
>
> @@ -123,6 +126,8 @@
> bool isThumb2() const { return IsThumb && (ThumbMode == Thumb2); }
> bool hasThumb2() const { return ThumbMode >= Thumb2; }
>
> + bool hasBranchTargetBuffer() const { return HasBranchTargetBuffer; }
> +
> bool isR9Reserved() const { return IsR9Reserved; }
>
> const std::string & getCPUString() const { return CPUString; }
>
>