[llvm-commits] [llvm] r173148 - in /llvm/trunk: include/llvm/Analysis/TargetTransformInfo.h lib/Analysis/CodeMetrics.cpp lib/Analysis/IPA/InlineCost.cpp lib/Analysis/TargetTransformInfo.cpp lib/Transforms/Scalar/TailRecursionElimination.cpp

Fri Feb 21 11:50:53 PST 2014

----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Chandler Carruth" <chandlerc at gmail.com>, llvm-commits at cs.uiuc.edu, "Nadav Rotem" <nrotem at apple.com>, "renato
> golin" <renato.golin at linaro.org>
> Sent: Friday, February 21, 2014 1:47:44 PM
> Subject: Re: [llvm-commits] [llvm] r173148 - in /llvm/trunk: include/llvm/Analysis/TargetTransformInfo.h
> lib/Analysis/CodeMetrics.cpp lib/Analysis/IPA/InlineCost.cpp lib/Analysis/TargetTransformInfo.cpp
> lib/Transforms/Scalar/TailRecursionElimination.cpp
> 
> 
> On Feb 21, 2014, at 11:34 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Chandler Carruth" <chandlerc at gmail.com>
> >> To: llvm-commits at cs.uiuc.edu
> >> Sent: Tuesday, January 22, 2013 5:26:02 AM
> >> Subject: [llvm-commits] [llvm] r173148 - in /llvm/trunk:
> >> include/llvm/Analysis/TargetTransformInfo.h
> >> lib/Analysis/CodeMetrics.cpp lib/Analysis/IPA/InlineCost.cpp
> >> lib/Analysis/TargetTransformInfo.cpp
> >> lib/Transforms/Scalar/TailRecursionElimination.cpp
> >> 
> >> Author: chandlerc
> >> Date: Tue Jan 22 05:26:02 2013
> >> New Revision: 173148
> >> 
> >> URL: http://llvm.org/viewvc/llvm-project?rev=173148&view=rev
> >> Log:
> >> Begin fleshing out an interface in TTI for modelling the costs of
> >> generic function calls and intrinsics. This is somewhat
> >> overlapping
> >> with
> >> an existing intrinsic cost method, but that one seems targetted at
> >> vector intrinsics. I'll merge them or separate their names and use
> >> cases
> >> in a separate commit.
> >> 
> >> This sinks the test of 'callIsSmall' down into TTI where targets
> >> can
> >> control it. The whole thing feels very hack-ish to me though. I've
> >> left
> >> a FIXME comment about the fundamental design problem this
> >> presents.
> >> It
> >> isn't yet clear to me what the users of this function *really*
> >> care
> >> about. I'll have to do more analysis to figure that out. Putting
> >> this
> >> here at least provides it access to proper analysis pass tools and
> >> other
> >> such. It also allows us to more cleanly implement the baseline
> >> cost
> >> interfaces in TTI.
> >> 
> >> With this commit, it is now theoretically possible to simplify
> >> much
> >> of
> >> the inline cost analysis's handling of calls by calling through to
> >> this
> >> interface. That conversion will have to happen in subsequent
> >> commits
> >> as
> >> it requires more extensive restructuring of the inline cost
> >> analysis.
> >> 
> >> The CodeMetrics class is now really only in the business of
> >> running
> >> over
> >> a block of code and aggregating the metrics on that block of code,
> >> with
> >> the actual cost evaluation done entirely in terms of TTI.
> > 
> > Looking at this in detail,
> > 
> > +    NumInsts += TTI.getUserCost(&*II);
> 
> The TTI interface is separated into two sections, a scalar one which
> this calls belongs to and the vectorized section. The calls in the
> vectorized section (getInstructionCost) estimate throughput (or are
> meant to) and the ones in the scalar section (getOperationCost)
> presumably code size.

You're right, thanks! I was just confused :(

 -Hal

> 
>  
> > 
> > I'm not sure this is doing what we want. The issue is that the cost
> > model is designed to be used with the vectorizer, and the 'costs'
> > produced by that model don't really measure instructions, or even
> > uop counts, but relative throughput. This means, for example, that
> > if we can dispatch two integer adds per cycle, and 1
> > floating-point add per cycle, then the floating point add will
> > have twice the cost of the integer add (which probably has a cost
> > of 1). But it is not clear to me at all that this is the
> > appropriate measure for inlining and unrolling (especially for
> > unrolling, for OOO cores with loop dispatch buffers, we almost
> > certainly want uop counts).
> > 
> > So the question is: what to do about this? The straightforward
> > thing seems to be to add some kind of 'cost-type' to all of the
> > TTI cost APIs. There seem to be three potentially relevant kinds
> > of costs:
> > - Scaled throughput
> > - uop counts
> > - instruction counts (or, perhaps instead: in-memory code size)
> > 
> > The advantage of doing this is, for the most part, none of the
> > target independent code would need to care about what kind of cost
> > we needed (because the scalarization logic probably does not
> > care), and just the target parts need to change. But maybe there
> > is a better way. Opinions?
> > 
> > -Hal
> > 
> >> 
> >> Modified:
> >>    llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >>    llvm/trunk/lib/Analysis/CodeMetrics.cpp
> >>    llvm/trunk/lib/Analysis/IPA/InlineCost.cpp
> >>    llvm/trunk/lib/Analysis/TargetTransformInfo.cpp
> >>    llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp
> >> 
> >> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=173148&r1=173147&r2=173148&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >> (original)
> >> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Tue Jan
> >> 22
> >> 05:26:02 2013
> >> @@ -117,6 +117,41 @@
> >>   virtual unsigned getGEPCost(const Value *Ptr,
> >>                               ArrayRef<const Value *> Operands)
> >>                               const;
> >> 
> >> +  /// \brief Estimate the cost of a function call when lowered.
> >> +  ///
> >> +  /// The contract for this is the same as \c getOperationCost
> >> except that it
> >> +  /// supports an interface that provides extra information
> >> specific
> >> to call
> >> +  /// instructions.
> >> +  ///
> >> +  /// This is the most basic query for estimating call cost: it
> >> only
> >> knows the
> >> +  /// function type and (potentially) the number of arguments at
> >> the
> >> call site.
> >> +  /// The latter is only interesting for varargs function types.
> >> +  virtual unsigned getCallCost(FunctionType *FTy, int NumArgs =
> >> -1)
> >> const;
> >> +
> >> +  /// \brief Estimate the cost of calling a specific function
> >> when
> >> lowered.
> >> +  ///
> >> +  /// This overload adds the ability to reason about the
> >> particular
> >> function
> >> +  /// being called in the event it is a library call with special
> >> lowering.
> >> +  virtual unsigned getCallCost(const Function *F, int NumArgs =
> >> -1)
> >> const;
> >> +
> >> +  /// \brief Estimate the cost of calling a specific function
> >> when
> >> lowered.
> >> +  ///
> >> +  /// This overload allows specifying a set of candidate argument
> >> values.
> >> +  virtual unsigned getCallCost(const Function *F,
> >> +                               ArrayRef<const Value *> Arguments)
> >> const;
> >> +
> >> +  /// \brief Estimate the cost of an intrinsic when lowered.
> >> +  ///
> >> +  /// Mirrors the \c getCallCost method but uses an intrinsic
> >> identifier.
> >> +  virtual unsigned getIntrinsicCost(Intrinsic::ID IID, Type
> >> *RetTy,
> >> +                                    ArrayRef<Type *> ParamTys)
> >> const;
> >> +
> >> +  /// \brief Estimate the cost of an intrinsic when lowered.
> >> +  ///
> >> +  /// Mirrors the \c getCallCost method but uses an intrinsic
> >> identifier.
> >> +  virtual unsigned getIntrinsicCost(Intrinsic::ID IID, Type
> >> *RetTy,
> >> +                                    ArrayRef<const Value *>
> >> Arguments) const;
> >> +
> >>   /// \brief Estimate the cost of a given IR user when lowered.
> >>   ///
> >>   /// This can estimate the cost of either a ConstantExpr or
> >>   Instruction when
> >> @@ -134,6 +169,20 @@
> >>   /// comments for a detailed explanation of the cost values.
> >>   virtual unsigned getUserCost(const User *U) const;
> >> 
> >> +  /// \brief Test whether calls to a function lower to actual
> >> program function
> >> +  /// calls.
> >> +  ///
> >> +  /// The idea is to test whether the program is likely to
> >> require a
> >> 'call'
> >> +  /// instruction or equivalent in order to call the given
> >> function.
> >> +  ///
> >> +  /// FIXME: It's not clear that this is a good or useful query
> >> API.
> >> Client's
> >> +  /// should probably move to simpler cost metrics using the
> >> above.
> >> +  /// Alternatively, we could split the cost interface into
> >> distinct
> >> code-size
> >> +  /// and execution-speed costs. This would allow modelling the
> >> core
> >> of this
> >> +  /// query more accurately as the a call is a single small
> >> instruction, but
> >> +  /// incurs significant execution cost.
> >> +  virtual bool isLoweredToCall(const Function *F) const;
> >> +
> >>   /// @}
> >> 
> >>   /// \name Scalar Target Information
> >> 
> >> Modified: llvm/trunk/lib/Analysis/CodeMetrics.cpp
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/CodeMetrics.cpp?rev=173148&r1=173147&r2=173148&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Analysis/CodeMetrics.cpp (original)
> >> +++ llvm/trunk/lib/Analysis/CodeMetrics.cpp Tue Jan 22 05:26:02
> >> 2013
> >> @@ -20,41 +20,6 @@
> >> 
> >> using namespace llvm;
> >> 
> >> -/// callIsSmall - If a call is likely to lower to a single target
> >> instruction,
> >> -/// or is otherwise deemed small return true.
> >> -/// TODO: Perhaps calls like memcpy, strcpy, etc?
> >> -bool llvm::callIsSmall(ImmutableCallSite CS) {
> >> -  if (isa<IntrinsicInst>(CS.getInstruction()))
> >> -    return true;
> >> -
> >> -  const Function *F = CS.getCalledFunction();
> >> -  if (!F) return false;
> >> -
> >> -  if (F->hasLocalLinkage()) return false;
> >> -
> >> -  if (!F->hasName()) return false;
> >> -
> >> -  StringRef Name = F->getName();
> >> -
> >> -  // These will all likely lower to a single selection DAG node.
> >> -  if (Name == "copysign" || Name == "copysignf" || Name ==
> >> "copysignl" ||
> >> -      Name == "fabs" || Name == "fabsf" || Name == "fabsl" ||
> >> -      Name == "sin" || Name == "sinf" || Name == "sinl" ||
> >> -      Name == "cos" || Name == "cosf" || Name == "cosl" ||
> >> -      Name == "sqrt" || Name == "sqrtf" || Name == "sqrtl" )
> >> -    return true;
> >> -
> >> -  // These are all likely to be optimized into something smaller.
> >> -  if (Name == "pow" || Name == "powf" || Name == "powl" ||
> >> -      Name == "exp2" || Name == "exp2l" || Name == "exp2f" ||
> >> -      Name == "floor" || Name == "floorf" || Name == "ceil" ||
> >> -      Name == "round" || Name == "ffs" || Name == "ffsl" ||
> >> -      Name == "abs" || Name == "labs" || Name == "llabs")
> >> -    return true;
> >> -
> >> -  return false;
> >> -}
> >> -
> >> /// analyzeBasicBlock - Fill in the current structure with
> >> information gleaned
> >> /// from the specified block.
> >> void CodeMetrics::analyzeBasicBlock(const BasicBlock *BB,
> >> @@ -63,9 +28,6 @@
> >>   unsigned NumInstsBeforeThisBB = NumInsts;
> >>   for (BasicBlock::const_iterator II = BB->begin(), E = BB->end();
> >>        II != E; ++II) {
> >> -    if (TargetTransformInfo::TCC_Free == TTI.getUserCost(&*II))
> >> -      continue;
> >> -
> >>     // Special handling for calls.
> >>     if (isa<CallInst>(II) || isa<InvokeInst>(II)) {
> >>       ImmutableCallSite CS(cast<Instruction>(II));
> >> @@ -83,12 +45,10 @@
> >>         // for that case.
> >>         if (F == BB->getParent())
> >>           isRecursive = true;
> >> -      }
> >> -
> >> -      if (!callIsSmall(CS)) {
> >> -        // Each argument to a call takes on average one
> >> instruction
> >> to set up.
> >> -        NumInsts += CS.arg_size();
> >> 
> >> +        if (TTI.isLoweredToCall(F))
> >> +          ++NumCalls;
> >> +      } else {
> >>         // We don't want inline asm to count as a call - that
> >>         would
> >>         prevent loop
> >>         // unrolling. The argument setup cost is still real,
> >>         though.
> >>         if (!isa<InlineAsm>(CS.getCalledValue()))
> >> @@ -112,7 +72,7 @@
> >>       if (InvI->hasFnAttr(Attribute::NoDuplicate))
> >>         notDuplicatable = true;
> >> 
> >> -    ++NumInsts;
> >> +    NumInsts += TTI.getUserCost(&*II);
> >>   }
> >> 
> >>   if (isa<ReturnInst>(BB->getTerminator()))
> >> 
> >> Modified: llvm/trunk/lib/Analysis/IPA/InlineCost.cpp
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IPA/InlineCost.cpp?rev=173148&r1=173147&r2=173148&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Analysis/IPA/InlineCost.cpp (original)
> >> +++ llvm/trunk/lib/Analysis/IPA/InlineCost.cpp Tue Jan 22 05:26:02
> >> 2013
> >> @@ -736,7 +736,7 @@
> >>       return false;
> >>     }
> >> 
> >> -    if (!callIsSmall(CS)) {
> >> +    if (TTI.isLoweredToCall(F)) {
> >>       // We account for the average 1 instruction per call
> >>       argument
> >>       setup
> >>       // here.
> >>       Cost += CS.arg_size() * InlineConstants::InstrCost;
> >> 
> >> Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=173148&r1=173147&r2=173148&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original)
> >> +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Tue Jan 22
> >> 05:26:02 2013
> >> @@ -14,6 +14,7 @@
> >> #include "llvm/IR/Instruction.h"
> >> #include "llvm/IR/IntrinsicInst.h"
> >> #include "llvm/IR/Instructions.h"
> >> +#include "llvm/Support/CallSite.h"
> >> #include "llvm/Support/ErrorHandling.h"
> >> 
> >> using namespace llvm;
> >> @@ -58,10 +59,39 @@
> >>   return PrevTTI->getGEPCost(Ptr, Operands);
> >> }
> >> 
> >> +unsigned TargetTransformInfo::getCallCost(FunctionType *FTy,
> >> +                                          int NumArgs) const {
> >> +  return PrevTTI->getCallCost(FTy, NumArgs);
> >> +}
> >> +
> >> +unsigned TargetTransformInfo::getCallCost(const Function *F,
> >> +                                          int NumArgs) const {
> >> +  return PrevTTI->getCallCost(F, NumArgs);
> >> +}
> >> +
> >> +unsigned TargetTransformInfo::getCallCost(
> >> +    const Function *F, ArrayRef<const Value *> Arguments) const {
> >> +  return PrevTTI->getCallCost(F, Arguments);
> >> +}
> >> +
> >> +unsigned TargetTransformInfo::getIntrinsicCost(
> >> +    Intrinsic::ID IID, Type *RetTy, ArrayRef<Type *> ParamTys)
> >> const
> >> {
> >> +  return PrevTTI->getIntrinsicCost(IID, RetTy, ParamTys);
> >> +}
> >> +
> >> +unsigned TargetTransformInfo::getIntrinsicCost(
> >> +    Intrinsic::ID IID, Type *RetTy, ArrayRef<const Value *>
> >> Arguments) const {
> >> +  return PrevTTI->getIntrinsicCost(IID, RetTy, Arguments);
> >> +}
> >> +
> >> unsigned TargetTransformInfo::getUserCost(const User *U) const {
> >>   return PrevTTI->getUserCost(U);
> >> }
> >> 
> >> +bool TargetTransformInfo::isLoweredToCall(const Function *F)
> >> const {
> >> +  return PrevTTI->isLoweredToCall(F);
> >> +}
> >> +
> >> bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {
> >>   return PrevTTI->isLegalAddImmediate(Imm);
> >> }
> >> @@ -179,6 +209,7 @@
> >>   virtual void initializePass() {
> >>     // Note that this subclass is special, and must *not* call
> >>     initializeTTI as
> >>     // it does not chain.
> >> +    TopTTI = this;
> >>     PrevTTI = 0;
> >>     DL = getAnalysisIfAvailable<DataLayout>();
> >>   }
> >> @@ -257,6 +288,84 @@
> >>     return TCC_Free;
> >>   }
> >> 
> >> +  unsigned getCallCost(FunctionType *FTy, int NumArgs = -1) const
> >> {
> >> +    assert(FTy && "FunctionType must be provided to this
> >> routine.");
> >> +
> >> +    // The target-independent implementation just measures the
> >> size
> >> of the
> >> +    // function by approximating that each argument will take on
> >> average one
> >> +    // instruction to prepare.
> >> +
> >> +    if (NumArgs < 0)
> >> +      // Set the argument number to the number of explicit
> >> arguments
> >> in the
> >> +      // function.
> >> +      NumArgs = FTy->getNumParams();
> >> +
> >> +    return TCC_Basic * (NumArgs + 1);
> >> +  }
> >> +
> >> +  unsigned getCallCost(const Function *F, int NumArgs = -1) const
> >> {
> >> +    assert(F && "A concrete function must be provided to this
> >> routine.");
> >> +
> >> +    if (NumArgs < 0)
> >> +      // Set the argument number to the number of explicit
> >> arguments
> >> in the
> >> +      // function.
> >> +      NumArgs = F->arg_size();
> >> +
> >> +    if (Intrinsic::ID IID = (Intrinsic::ID)F->getIntrinsicID()) {
> >> +      FunctionType *FTy = F->getFunctionType();
> >> +      SmallVector<Type *, 8> ParamTys(FTy->param_begin(),
> >> FTy->param_end());
> >> +      return TopTTI->getIntrinsicCost(IID, FTy->getReturnType(),
> >> ParamTys);
> >> +    }
> >> +
> >> +    if (!TopTTI->isLoweredToCall(F))
> >> +      return TCC_Basic; // Give a basic cost if it will be
> >> lowered
> >> directly.
> >> +
> >> +    return TopTTI->getCallCost(F->getFunctionType(), NumArgs);
> >> +  }
> >> +
> >> +  unsigned getCallCost(const Function *F,
> >> +                       ArrayRef<const Value *> Arguments) const {
> >> +    // Simply delegate to generic handling of the call.
> >> +    // FIXME: We should use instsimplify or something else to
> >> catch
> >> calls which
> >> +    // will constant fold with these arguments.
> >> +    return TopTTI->getCallCost(F, Arguments.size());
> >> +  }
> >> +
> >> +  unsigned getIntrinsicCost(Intrinsic::ID IID, Type *RetTy,
> >> +                            ArrayRef<Type *> ParamTys) const {
> >> +    switch (IID) {
> >> +    default:
> >> +      // Intrinsics rarely (if ever) have normal argument setup
> >> constraints.
> >> +      // Model them as having a basic instruction cost.
> >> +      // FIXME: This is wrong for libc intrinsics.
> >> +      return TCC_Basic;
> >> +
> >> +    case Intrinsic::dbg_declare:
> >> +    case Intrinsic::dbg_value:
> >> +    case Intrinsic::invariant_start:
> >> +    case Intrinsic::invariant_end:
> >> +    case Intrinsic::lifetime_start:
> >> +    case Intrinsic::lifetime_end:
> >> +    case Intrinsic::objectsize:
> >> +    case Intrinsic::ptr_annotation:
> >> +    case Intrinsic::var_annotation:
> >> +      // These intrinsics don't actually represent code after
> >> lowering.
> >> +      return TCC_Free;
> >> +    }
> >> +  }
> >> +
> >> +  unsigned getIntrinsicCost(Intrinsic::ID IID, Type *RetTy,
> >> +                            ArrayRef<const Value *> Arguments)
> >> const
> >> {
> >> +    // Delegate to the generic intrinsic handling code. This
> >> mostly
> >> provides an
> >> +    // opportunity for targets to (for example) special case the
> >> cost of
> >> +    // certain intrinsics based on constants used as arguments.
> >> +    SmallVector<Type *, 8> ParamTys;
> >> +    ParamTys.reserve(Arguments.size());
> >> +    for (unsigned Idx = 0, Size = Arguments.size(); Idx != Size;
> >> ++Idx)
> >> +      ParamTys.push_back(Arguments[Idx]->getType());
> >> +    return TopTTI->getIntrinsicCost(IID, RetTy, ParamTys);
> >> +  }
> >> +
> >>   unsigned getUserCost(const User *U) const {
> >>     if (isa<PHINode>(U))
> >>       return TCC_Free; // Model all PHI nodes as free.
> >> @@ -266,25 +375,21 @@
> >>       // folded into their uses via addressing modes.
> >>       return GEP->hasAllConstantIndices() ? TCC_Free : TCC_Basic;
> >> 
> >> -    // If we have a call of an intrinsic we can provide more
> >> detailed analysis
> >> -    // by inspecting the particular intrinsic called.
> >> -    // FIXME: Hoist this out into a getIntrinsicCost routine.
> >> -    if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(U)) {
> >> -      switch (II->getIntrinsicID()) {
> >> -      default:
> >> -        return TCC_Basic;
> >> -      case Intrinsic::dbg_declare:
> >> -      case Intrinsic::dbg_value:
> >> -      case Intrinsic::invariant_start:
> >> -      case Intrinsic::invariant_end:
> >> -      case Intrinsic::lifetime_start:
> >> -      case Intrinsic::lifetime_end:
> >> -      case Intrinsic::objectsize:
> >> -      case Intrinsic::ptr_annotation:
> >> -      case Intrinsic::var_annotation:
> >> -        // These intrinsics don't count as size.
> >> -        return TCC_Free;
> >> +    if (ImmutableCallSite CS = U) {
> >> +      const Function *F = CS.getCalledFunction();
> >> +      if (!F) {
> >> +        // Just use the called value type.
> >> +        Type *FTy =
> >> CS.getCalledValue()->getType()->getPointerElementType();
> >> +        return TopTTI->getCallCost(cast<FunctionType>(FTy),
> >> CS.arg_size());
> >>       }
> >> +
> >> +      SmallVector<const Value *, 8> Arguments;
> >> +      for (ImmutableCallSite::arg_iterator AI = CS.arg_begin(),
> >> +                                           AE = CS.arg_end();
> >> +           AI != AE; ++AI)
> >> +        Arguments.push_back(*AI);
> >> +
> >> +      return TopTTI->getCallCost(F, Arguments);
> >>     }
> >> 
> >>     if (const CastInst *CI = dyn_cast<CastInst>(U)) {
> >> @@ -301,6 +406,37 @@
> >>                                 U->getOperand(0)->getType() : 0);
> >>   }
> >> 
> >> +  bool isLoweredToCall(const Function *F) const {
> >> +    // FIXME: These should almost certainly not be handled here,
> >> and
> >> instead
> >> +    // handled with the help of TLI or the target itself. This
> >> was
> >> largely
> >> +    // ported from existing analysis heuristics here so that such
> >> refactorings
> >> +    // can take place in the future.
> >> +
> >> +    if (F->isIntrinsic())
> >> +      return false;
> >> +
> >> +    if (F->hasLocalLinkage() || !F->hasName())
> >> +      return true;
> >> +
> >> +    StringRef Name = F->getName();
> >> +
> >> +    // These will all likely lower to a single selection DAG
> >> node.
> >> +    if (Name == "copysign" || Name == "copysignf" || Name ==
> >> "copysignl" ||
> >> +        Name == "fabs" || Name == "fabsf" || Name == "fabsl" ||
> >> Name
> >> == "sin" ||
> >> +        Name == "sinf" || Name == "sinl" || Name == "cos" || Name
> >> ==
> >> "cosf" ||
> >> +        Name == "cosl" || Name == "sqrt" || Name == "sqrtf" ||
> >> Name
> >> == "sqrtl")
> >> +      return false;
> >> +
> >> +    // These are all likely to be optimized into something
> >> smaller.
> >> +    if (Name == "pow" || Name == "powf" || Name == "powl" || Name
> >> ==
> >> "exp2" ||
> >> +        Name == "exp2l" || Name == "exp2f" || Name == "floor" ||
> >> Name ==
> >> +        "floorf" || Name == "ceil" || Name == "round" || Name ==
> >> "ffs" ||
> >> +        Name == "ffsl" || Name == "abs" || Name == "labs" || Name
> >> ==
> >> "llabs")
> >> +      return false;
> >> +
> >> +    return true;
> >> +  }
> >> +
> >>   bool isLegalAddImmediate(int64_t Imm) const {
> >>     return false;
> >>   }
> >> 
> >> Modified:
> >> llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp?rev=173148&r1=173147&r2=173148&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp
> >> (original)
> >> +++ llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp
> >> Tue
> >> Jan 22 05:26:02 2013
> >> @@ -58,6 +58,7 @@
> >> #include "llvm/Analysis/InlineCost.h"
> >> #include "llvm/Analysis/InstructionSimplify.h"
> >> #include "llvm/Analysis/Loads.h"
> >> +#include "llvm/Analysis/TargetTransformInfo.h"
> >> #include "llvm/IR/Constants.h"
> >> #include "llvm/IR/DerivedTypes.h"
> >> #include "llvm/IR/Function.h"
> >> @@ -79,11 +80,15 @@
> >> 
> >> namespace {
> >>   struct TailCallElim : public FunctionPass {
> >> +    const TargetTransformInfo *TTI;
> >> +
> >>     static char ID; // Pass identification, replacement for typeid
> >>     TailCallElim() : FunctionPass(ID) {
> >>       initializeTailCallElimPass(*PassRegistry::getPassRegistry());
> >>     }
> >> 
> >> +    virtual void getAnalysisUsage(AnalysisUsage &AU) const;
> >> +
> >>     virtual bool runOnFunction(Function &F);
> >> 
> >>   private:
> >> @@ -109,14 +114,21 @@
> >> }
> >> 
> >> char TailCallElim::ID = 0;
> >> -INITIALIZE_PASS(TailCallElim, "tailcallelim",
> >> -                "Tail Call Elimination", false, false)
> >> +INITIALIZE_PASS_BEGIN(TailCallElim, "tailcallelim",
> >> +                      "Tail Call Elimination", false, false)
> >> +INITIALIZE_AG_DEPENDENCY(TargetTransformInfo)
> >> +INITIALIZE_PASS_END(TailCallElim, "tailcallelim",
> >> +                    "Tail Call Elimination", false, false)
> >> 
> >> // Public interface to the TailCallElimination pass
> >> FunctionPass *llvm::createTailCallEliminationPass() {
> >>   return new TailCallElim();
> >> }
> >> 
> >> +void TailCallElim::getAnalysisUsage(AnalysisUsage &AU) const {
> >> +  AU.addRequired<TargetTransformInfo>();
> >> +}
> >> +
> >> /// AllocaMightEscapeToCalls - Return true if this alloca may be
> >> accessed by
> >> /// callees of this function.  We only do very simple analysis
> >> right
> >> now, this
> >> /// could be expanded in the future to use mod/ref information for
> >> particular
> >> @@ -151,6 +163,7 @@
> >>   // right, so don't even try to convert it...
> >>   if (F.getFunctionType()->isVarArg()) return false;
> >> 
> >> +  TTI = &getAnalysis<TargetTransformInfo>();
> >>   BasicBlock *OldEntry = 0;
> >>   bool TailCallsAreMarkedTail = false;
> >>   SmallVector<PHINode*, 8> ArgumentPHIs;
> >> @@ -391,7 +404,8 @@
> >>   if (BB == &F->getEntryBlock() &&
> >>       FirstNonDbg(BB->front()) == CI &&
> >>       FirstNonDbg(llvm::next(BB->begin())) == TI &&
> >> -      callIsSmall(CI)) {
> >> +      CI->getCalledFunction() &&
> >> +      !TTI->isLoweredToCall(CI->getCalledFunction())) {
> >>     // A single-block function with just a call and a return.
> >>     Check
> >>     that
> >>     // the arguments match.
> >>     CallSite::arg_iterator I = CallSite(CI).arg_begin(),
> >> 
> >> 
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >> 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory