[llvm-commits] some thoughts on lowering for calling conventions

Bob Wilson bob.wilson at apple.com
Tue Sep 14 09:49:05 PDT 2010


This patch and an unrelated comment from Eric got me thinking some more about llvm's handling of calling conventions.  I'm not very happy with our current approach.  The specific issue I'm thinking about now is that we lower either too early or too late.

Lowering in the front-end should be minimized.  It's too early.  Interprocedural analyses and optimizations will suffer. (E.g., when an f64 argument is lowered to a pair of i32 values, it's hard for an analysis to track how that argument is used.) Since we currently support 2 front-ends, it also means that we need to do the front-end lowering in 2 places and keep both of them up to date.  The front-end has to do some lowering in cases where the back-end doesn't have the language-specific information to do the job, but otherwise, I'd like to see the front-end avoid lowering for calling conventions.

Aside from the front-ends, the rest of our calling convention support is handled with selection DAGs.  That is too late.  We'd really like the optimizer to see all the code for splitting up and recombining arguments.  The DAG combiner and machine instruction optimizations clean up some of expanded code, but it's not ideal.  I assume that is the motivation for Rafael's patch here.  Eric is working on fast isel for ARM, and he mentioned to me recently that since fast isel doesn't build selection DAGs, it has to duplicate all the support for lowering calling conventions.

Can we do better?

If we had a target-specific lowering for calling conventions at the llvm IR level, instead of for selection DAGs, then we could run that lowering pass after any high-level interprocedural analyses and optimizations but before things like instcombine.  That would avoid the need for front-ends to "pre-lower" things like this patch, avoid duplicating effort across front-ends and across selection DAGs and fast-isel, preserve information for high-level passes, and hopefully give us better optimization of the code resulting from lowering.

Thoughts?  (I fully expect there to be many obstacles to such a change, but I'm curious to hear if there's any consensus about the right solution for the long term.  I'm not necessarily advocating that we change anything right now.)

On Sep 11, 2010, at 10:37 AM, Rafael Espindola wrote:

> Author: rafael
> Date: Sat Sep 11 12:37:41 2010
> New Revision: 113694
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=113694&view=rev
> Log:
> Lower ARM function signatures a bit more. I implemented only AAPCS since that
> is what I can test. With this patch we convert
> 
> struct foo {double a; int b;};
> void f(int, struct foo);
> void g(int a, double b, int c) {
> struct foo x = {b, c};
> f(a, x);
> }
> 
> into
> ------------------------------------------------------
> define void @g(i32 %a.0, i32, i32 %b.0, i32 %b.1, i32 %c.0) nounwind optsize {
> entry:
> tail call void @f(i32 %a.0, i32 undef, i32 %b.0, i32 %b.1, i32 %c.0,
> i32 0) nounwind
> ret void
> }
> declare void @f(i32, i32, i32, i32, i32, i32)
> ----------------------------------------------------
> 
> instead of
> 
> ---------------------------------------------------
> define void @g(i32 %a, double %b, i32 %c) nounwind optsize {
> entry:
> %tmp11 = bitcast double %b to i64
> %tmp5 = zext i32 %c to i64
> tail call void @f(i32 %a, i64 %tmp11, i64 %tmp5) nounwind
> ret void
> }
> declare void @f(i32, i64, i64)
> --------------------------------------------------
> 
> The IL for
> 
> ------------------------------------------
> double foo(double X, double Y) { return X+Y; }
> -----------------------------------------
> 
> now shows that the arguments being passed in the integer registres:
> 
> ----------------------------------------------------------------------
> define double @foo(i32 %X.0, i32 %X.1, i32 %Y.0, i32 %Y.1) nounwind
> readnone optsize {
> entry:
> %tmp17 = zext i32 %X.0 to i64
> %tmp12 = zext i32 %X.1 to i64
> %tmp13 = shl i64 %tmp12, 32
> %ins15 = or i64 %tmp13, %tmp17
> %tmp6 = zext i32 %Y.0 to i64
> %tmp3 = zext i32 %Y.1 to i64
> %tmp4 = shl i64 %tmp3, 32
> %ins = or i64 %tmp4, %tmp6
> %tmp10 = bitcast i64 %ins15 to double
> %tmp2 = bitcast i64 %ins to double
> %0 = fadd double %tmp10, %tmp2
> ret double %0
> }
> ---------------------------------------------------------------------
> 
> And the produces assembly is the same as before:
> 
> ----------------------------------------------------------------
> foo:
>       vmov    d0, r2, r3
>       vmov    d1, r0, r1
>       vadd.f64        d0, d1, d0
>       vmov    r0, r1, d0
>       bx      lr
> ----------------------------------------------------------------
> 
> Modified:
>    llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm-target.h
>    llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm.cpp
>    llvm-gcc-4.2/trunk/gcc/llvm-abi-default.cpp
>    llvm-gcc-4.2/trunk/gcc/llvm-abi.h
>    llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
>    llvm-gcc-4.2/trunk/gcc/llvm-types.cpp
> 
> Modified: llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm-target.h
> URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm-target.h?rev=113694&r1=113693&r2=113694&view=diff
> ==============================================================================
> --- llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm-target.h (original)
> +++ llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm-target.h Sat Sep 11 12:37:41 2010
> @@ -55,6 +55,15 @@
> #define LLVM_SHOULD_PASS_AGGREGATE_IN_MIXED_REGS(T, TY, CC, E)    \
>    llvm_arm_should_pass_aggregate_in_mixed_regs((T), (TY), (CC), (E))
> 
> +struct DefaultABIClient;
> +extern bool
> +llvm_arm_try_pass_aggregate_custom(tree, std::vector<const Type*>&,
> +				   CallingConv::ID&,
> +				   struct DefaultABIClient*);
> +
> +#define LLVM_TRY_PASS_AGGREGATE_CUSTOM(T, E, CC, C)	\
> +  llvm_arm_try_pass_aggregate_custom((T), (E), (CC), (C))
> +
> extern
> bool llvm_arm_aggregate_partially_passed_in_regs(std::vector<const Type*>&,
>                                                  std::vector<const Type*>&,
> 
> Modified: llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm.cpp?rev=113694&r1=113693&r2=113694&view=diff
> ==============================================================================
> --- llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm.cpp (original)
> +++ llvm-gcc-4.2/trunk/gcc/config/arm/llvm-arm.cpp Sat Sep 11 12:37:41 2010
> @@ -2557,6 +2557,91 @@
>   }
> }
> 
> +static unsigned count_num_words(std::vector<const Type*> &ScalarElts) {
> +  unsigned NumWords = 0;
> +  for (unsigned i = 0, e = ScalarElts.size(); i != e; ++i) {
> +    const Type *Ty = ScalarElts[i];
> +    if (Ty->isPointerTy()) {
> +      NumWords++;
> +    } else if (Ty->isIntegerTy()) {
> +      const unsigned TypeSize = Ty->getPrimitiveSizeInBits();
> +      const unsigned NumWordsForType = (TypeSize + 31) / 32;
> +
> +      NumWords += NumWordsForType;
> +    } else {
> +      assert (0 && "Unexpected type.");
> +    }
> +  }
> +  return NumWords;
> +}
> +
> +// This function is used only on AAPCS. The difference from the generic
> +// handling of arguments is that arguments larger than 32 bits are split
> +// and padding arguments are added as necessary for alignment. This makes
> +// the IL a bit more explicit about how arguments are handled.
> +extern bool
> +llvm_arm_try_pass_aggregate_custom(tree type,
> +                                   std::vector<const Type*>& ScalarElts,
> +				   CallingConv::ID& CC,
> +				   struct DefaultABIClient* C) {
> +  if (CC != CallingConv::ARM_AAPCS && CC != CallingConv::C)
> +    return false;
> +
> +  if (CC == CallingConv::C && !TARGET_AAPCS_BASED)
> +    return false;
> +
> +  if (TARGET_HARD_FLOAT_ABI)
> +    return false;
> +  const Type *Ty = ConvertType(type);
> +  if (Ty->isPointerTy())
> +    return false;
> +
> +  const unsigned Size = TREE_INT_CST_LOW(TYPE_SIZE(type))/8;
> +  const unsigned Alignment = TYPE_ALIGN(type)/8;
> +  const unsigned NumWords = count_num_words(ScalarElts);
> +  const bool AddPad = Alignment >= 8 && (NumWords % 2);
> +
> +  // First, build a type that will be bitcast to the original one and
> +  // from where elements will be extracted.
> +  std::vector<const Type*> Elts;
> +  const Type* Int32Ty = Type::getInt32Ty(getGlobalContext());
> +  const unsigned NumRegularArgs = Size / 4;
> +  for (unsigned i = 0; i < NumRegularArgs; ++i) {
> +    Elts.push_back(Int32Ty);
> +  }
> +  const unsigned RestSize = Size % 4;
> +  const llvm::Type *RestType = NULL;
> +  if (RestSize> 2) {
> +    RestType = Type::getInt32Ty(getGlobalContext());
> +  } else if (RestSize > 1) {
> +    RestType = Type::getInt16Ty(getGlobalContext());
> +  } else if (RestSize > 0) {
> +    RestType = Type::getInt8Ty(getGlobalContext());
> +  }
> +  if (RestType)
> +    Elts.push_back(RestType);
> +  const StructType *STy = StructType::get(getGlobalContext(), Elts, false);
> +
> +  if (AddPad) {
> +    ScalarElts.push_back(Int32Ty);
> +    C->HandlePad(Int32Ty);
> +  }
> +
> +  for (unsigned i = 0; i < NumRegularArgs; ++i) {
> +    C->EnterField(i, STy);
> +    C->HandleScalarArgument(Int32Ty, 0);
> +    ScalarElts.push_back(Int32Ty);
> +    C->ExitField();
> +  }
> +  if (RestType) {
> +    C->EnterField(NumRegularArgs, STy);
> +    C->HandleScalarArgument(RestType, 0, RestSize);
> +    ScalarElts.push_back(RestType);
> +    C->ExitField();
> +  }
> +  return true;
> +}
> +
> // Target hook for llvm-abi.h. It returns true if an aggregate of the
> // specified type should be passed in a number of registers of mixed types.
> // It also returns a vector of types that correspond to the registers used
> 
> Modified: llvm-gcc-4.2/trunk/gcc/llvm-abi-default.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-abi-default.cpp?rev=113694&r1=113693&r2=113694&view=diff
> ==============================================================================
> --- llvm-gcc-4.2/trunk/gcc/llvm-abi-default.cpp (original)
> +++ llvm-gcc-4.2/trunk/gcc/llvm-abi-default.cpp Sat Sep 11 12:37:41 2010
> @@ -8,7 +8,8 @@
> /// return type. It potentially breaks down the argument and invokes methods
> /// on the client that indicate how its pieces should be handled.  This
> /// handles things like returning structures via hidden parameters.
> -void DefaultABI::HandleReturnType(tree type, tree fn, bool isBuiltin) {
> +void DefaultABI::HandleReturnType(tree type, tree fn, bool isBuiltin,
> +                                  std::vector<const Type*> &ScalarElts) {
>   unsigned Offset = 0;
>   const Type *Ty = ConvertType(type);
>   if (Ty->isVectorTy()) {
> @@ -52,7 +53,9 @@
> 
>     // FIXME: should return the hidden first argument for some targets
>     // (e.g. ELF i386).
> -    C.HandleAggregateShadowResult(Ty->getPointerTo(), false);
> +    const PointerType *PTy = Ty->getPointerTo();
> +    C.HandleAggregateShadowResult(PTy, false);
> +    ScalarElts.push_back(PTy);
>   }
> }
> 
> 
> Modified: llvm-gcc-4.2/trunk/gcc/llvm-abi.h
> URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-abi.h?rev=113694&r1=113693&r2=113694&view=diff
> ==============================================================================
> --- llvm-gcc-4.2/trunk/gcc/llvm-abi.h (original)
> +++ llvm-gcc-4.2/trunk/gcc/llvm-abi.h Sat Sep 11 12:37:41 2010
> @@ -396,7 +396,8 @@
>   /// return type. It potentially breaks down the argument and invokes methods
>   /// on the client that indicate how its pieces should be handled.  This
>   /// handles things like returning structures via hidden parameters.
> -  void HandleReturnType(tree type, tree fn, bool isBuiltin);
> +  void HandleReturnType(tree type, tree fn, bool isBuiltin,
> +                        std::vector<const Type*> &ScalarElts);
> 
>   /// HandleArgument - This is invoked by the target-independent code for each
>   /// argument type passed into the function.  It potentially breaks down the
> 
> Modified: llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp?rev=113694&r1=113693&r2=113694&view=diff
> ==============================================================================
> --- llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp (original)
> +++ llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp Sat Sep 11 12:37:41 2010
> @@ -673,17 +673,19 @@
>   FunctionPrologArgumentConversion Client(FnDecl, AI, Builder, CallingConv);
>   DefaultABI ABIConverter(Client);
> 
> +  // Scalar arguments processed so far.
> +  std::vector<const Type*> ScalarArgs;
> +
>   // Handle the DECL_RESULT.
>   ABIConverter.HandleReturnType(TREE_TYPE(TREE_TYPE(FnDecl)), FnDecl,
> -                                DECL_BUILT_IN(FnDecl));
> +                                DECL_BUILT_IN(FnDecl),
> +                                ScalarArgs);
>   // Remember this for use by FinishFunctionBody.
>   ReturnOffset = Client.Offset;
> 
>   // Prepend the static chain (if any) to the list of arguments.
>   tree Args = static_chain ? static_chain : DECL_ARGUMENTS(FnDecl);
> 
> -  // Scalar arguments processed so far.
> -  std::vector<const Type*> ScalarArgs;
>   while (Args) {
>     const char *Name = "unnamed_arg";
>     if (DECL_NAME(Args)) Name = IDENTIFIER_POINTER(DECL_NAME(Args));
> @@ -3012,16 +3014,17 @@
>   DefaultABI ABIConverter(Client);
> 
>   // Handle the result, including struct returns.
> +  std::vector<const Type*> ScalarArgs;
>   ABIConverter.HandleReturnType(TREE_TYPE(exp),
>                                 fndecl ? fndecl : exp,
> -                                fndecl ? DECL_BUILT_IN(fndecl) : false);
> +                                fndecl ? DECL_BUILT_IN(fndecl) : false,
> +                                ScalarArgs);
> 
>   // Pass the static chain, if any, as the first parameter.
>   if (TREE_OPERAND(exp, 2))
>     CallOperands.push_back(Emit(TREE_OPERAND(exp, 2), 0));
> 
>   // Loop over the arguments, expanding them and adding them to the op list.
> -  std::vector<const Type*> ScalarArgs;
>   for (tree arg = TREE_OPERAND(exp, 1); arg; arg = TREE_CHAIN(arg)) {
>     tree type = TREE_TYPE(TREE_VALUE(arg));
>     const Type *ArgTy = ConvertType(type);
> 
> Modified: llvm-gcc-4.2/trunk/gcc/llvm-types.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-types.cpp?rev=113694&r1=113693&r2=113694&view=diff
> ==============================================================================
> --- llvm-gcc-4.2/trunk/gcc/llvm-types.cpp (original)
> +++ llvm-gcc-4.2/trunk/gcc/llvm-types.cpp Sat Sep 11 12:37:41 2010
> @@ -1082,8 +1082,10 @@
>   TARGET_ADJUST_LLVM_CC(CallingConv, type);
> #endif
> 
> +  std::vector<const Type*> ScalarArgs;
>   // Builtins are always prototyped, so this isn't one.
> -  ABIConverter.HandleReturnType(ReturnType, current_function_decl, false);
> +  ABIConverter.HandleReturnType(ReturnType, current_function_decl, false,
> +                                ScalarArgs);
> 
>   SmallVector<AttributeWithIndex, 8> Attrs;
> 
> @@ -1110,7 +1112,6 @@
>     Attrs.push_back(AttributeWithIndex::get(ArgTys.size(),
>                                     Attribute::StructRet));
> 
> -  std::vector<const Type*> ScalarArgs;
>   if (static_chain) {
>     // Pass the static chain as the first parameter.
>     ABIConverter.HandleArgument(TREE_TYPE(static_chain), ScalarArgs);
> @@ -1152,8 +1153,10 @@
>   TARGET_ADJUST_LLVM_CC(CallingConv, type);
> #endif
> 
> +  std::vector<const Type*> ScalarArgs;
>   ABIConverter.HandleReturnType(TREE_TYPE(type), current_function_decl,
> -                                decl ? DECL_BUILT_IN(decl) : false);
> +                                decl ? DECL_BUILT_IN(decl) : false,
> +                                ScalarArgs);
> 
>   // Compute attributes for return type (and function attributes).
>   SmallVector<AttributeWithIndex, 8> Attrs;
> @@ -1218,7 +1221,6 @@
>     Attrs.push_back(AttributeWithIndex::get(ArgTypes.size(),
>                                     Attribute::StructRet | Attribute::NoAlias));
> 
> -  std::vector<const Type*> ScalarArgs;
>   if (static_chain) {
>     // Pass the static chain as the first parameter.
>     ABIConverter.HandleArgument(TREE_TYPE(static_chain), ScalarArgs);
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list