[LLVMdev] Performance regression on ARM

Anton Korobeynikov anton at korobeynikov.info
Sun Oct 19 06:54:58 PDT 2014


I feel like opening a can full of worms here. Here is full story:

1. On ARM all the EABI library functions need to use AAPCS calling
convention even on hard floating point targets
2. __mulsc3, __muldc3 and friends are not defined by EABI, however
both compiler-rt and libgcc emit them with AAPCS, not AAPCS-VFP
calling convention

Right now clang creates calls to __mul?c3 routines, but uses C calling
convention, which on hard FP targets got translated into AAPCS-VFP CC,
which is certainly not what the implementation expects.

The obvious fix is to make sure clang specifies proper calling
convention for such calls. And here we have plenty of funny details.

1. There is CodeGenModule::CreateRuntimeFunction() which is used to
emit calls to runtime functions. Surely, we cannot use it here,
because we would not like to force calling convention on routines like
__cxa_throw, etc. (while they are not using and FP arguments and
technically it's safe for ARM, but...)
2. Even more CodeGenModule::CreateRuntimeFunction() itself is not
enough. The reason is that we need to specify a calling convention on
*both* call and function.  Usually we're using
CodeGenFunction::EmitRuntimeCall() simply because creating a function
requires one to specify LLVM calling convention and
CodeGenFunction::CreateCall() expects clang's calling convention ;)
3. So, for now I decided to separate calling conventions used to emit
calls to runtime functions and compiler builtins.

Some patch which shows this approach (and the whole problem) is attached :)

On Sat, Oct 18, 2014 at 3:27 PM, Anton Korobeynikov
<anton at korobeynikov.info> wrote:
> Hi Chandler,
>
> That's embarrassing how weird this part of clang is. I have a provisional
> patch which fixes the problem but underlines clang's problems. I will submit
> it tonight for comments.
>
> суббота, 18 октября 2014 г. пользователь Chandler Carruth написал:
>
>>
>> On Fri, Oct 17, 2014 at 7:51 AM, Anton Korobeynikov
>> <anton at korobeynikov.info> wrote:
>>>
>>> > Chandler’s complex arithmetic changes are also in the range: r219557 in
>>> > clang.  We saw it change the code in mandel-2 significantly.
>>> mandel-2 is broken on hard FP ABI systems, btw. The reason is simply:
>>> we're emitting a call to __muldc3 with AAPCS VFP calling convention,
>>> however, the function expects softfp (AAPCS) calling conv and reads
>>> garbage from GP registers.
>>>
>>> I'm working on fix.
>>
>>
>> Thanks for looking at this Anton.
>>
>> I don't really know what signal should be used here. Several initial
>> attempts at this didn't work because of ABI and calling convention issues
>> and I thought I had figured out a sound solution by directly forming and
>> calling the library routines using Clang's code generation to handle all of
>> the ABI issues.... but apparently not.
>>
>> If you need to back out the patch temporarily or take any other steps to
>> mitigate, by all means. =/ Sorry for the fallout.
>
>
>
> --
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>



-- 
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
-------------- next part --------------
diff --git a/lib/CodeGen/ABIInfo.h b/lib/CodeGen/ABIInfo.h
index 9ca5766..b93ca0d 100644
--- a/lib/CodeGen/ABIInfo.h
+++ b/lib/CodeGen/ABIInfo.h
@@ -44,9 +44,12 @@ namespace clang {
     CodeGen::CodeGenTypes &CGT;
   protected:
     llvm::CallingConv::ID RuntimeCC;
+    llvm::CallingConv::ID BuiltinCC;
   public:
     ABIInfo(CodeGen::CodeGenTypes &cgt)
-      : CGT(cgt), RuntimeCC(llvm::CallingConv::C) {}
+      : CGT(cgt),
+        RuntimeCC(llvm::CallingConv::C),
+        BuiltinCC(llvm::CallingConv::C) {}
 
     virtual ~ABIInfo();
 
@@ -62,6 +65,11 @@ namespace clang {
       return RuntimeCC;
     }
 
+    /// Return the calling convention to use for compiler builtins
+    llvm::CallingConv::ID getBuiltinCC() const {
+      return BuiltinCC;
+    }
+
     virtual void computeInfo(CodeGen::CGFunctionInfo &FI) const = 0;
 
     /// EmitVAArg - Emit the target dependent code to load a value of
diff --git a/lib/CodeGen/CGExprComplex.cpp b/lib/CodeGen/CGExprComplex.cpp
index 271c9bc..977e033 100644
--- a/lib/CodeGen/CGExprComplex.cpp
+++ b/lib/CodeGen/CGExprComplex.cpp
@@ -578,13 +578,22 @@ ComplexPairTy ComplexExprEmitter::EmitComplexBinOpLibCall(StringRef LibCallName,
            Op.Ty->castAs<ComplexType>()->getElementType());
 
   // We *must* use the full CG function call building logic here because the
-  // complex type has special ABI handling.
-  const CGFunctionInfo &FuncInfo = CGF.CGM.getTypes().arrangeFreeFunctionCall(
-      Op.Ty, Args, FunctionType::ExtInfo(), RequiredArgs::All);
+  // complex type has special ABI handling. We also should not forget about
+  // special calling convention which may be used for compiler builtins.
+  const CGFunctionInfo &FuncInfo =
+    CGF.CGM.getTypes().arrangeFreeFunctionCall(
+      Op.Ty, Args, FunctionType::ExtInfo(/* No CC here - will be added later */),
+      RequiredArgs::All);
   llvm::FunctionType *FTy = CGF.CGM.getTypes().GetFunctionType(FuncInfo);
-  llvm::Constant *Func = CGF.CGM.CreateRuntimeFunction(FTy, LibCallName);
+  llvm::Constant *Func = CGF.CGM.CreateBuiltinFunction(FTy, LibCallName);
+  llvm::Instruction *Call;
 
-  return CGF.EmitCall(FuncInfo, Func, ReturnValueSlot(), Args).getComplexVal();
+  RValue Res = CGF.EmitCall(FuncInfo, Func, ReturnValueSlot(), Args,
+                            nullptr, &Call);
+  cast<llvm::CallInst>(Call)->setCallingConv(CGF.CGM.getBuiltinCC());
+  cast<llvm::CallInst>(Call)->setDoesNotThrow();
+
+  return Res.getComplexVal();
 }
 
 // See C11 Annex G.5.1 for the semantics of multiplicative operators on complex
diff --git a/lib/CodeGen/CodeGenModule.cpp b/lib/CodeGen/CodeGenModule.cpp
index 5d314f5..8f4f705 100644
--- a/lib/CodeGen/CodeGenModule.cpp
+++ b/lib/CodeGen/CodeGenModule.cpp
@@ -111,6 +111,7 @@ CodeGenModule::CodeGenModule(ASTContext &C, const CodeGenOptions &CGO,
   Int8PtrPtrTy = Int8PtrTy->getPointerTo(0);
 
   RuntimeCC = getTargetCodeGenInfo().getABIInfo().getRuntimeCC();
+  BuiltinCC = getTargetCodeGenInfo().getABIInfo().getBuiltinCC();
 
   if (LangOpts.ObjC1)
     createObjCRuntime();
@@ -1593,6 +1594,21 @@ CodeGenModule::CreateRuntimeFunction(llvm::FunctionType *FTy,
   return C;
 }
 
+/// CreateBuiltinFunction - Create a new builtin function with the specified
+/// type and name.
+llvm::Constant *
+CodeGenModule::CreateBuiltinFunction(llvm::FunctionType *FTy,
+                                     StringRef Name,
+                                     llvm::AttributeSet ExtraAttrs) {
+  llvm::Constant *C =
+      GetOrCreateLLVMFunction(Name, FTy, GlobalDecl(), /*ForVTable=*/false,
+                              /*DontDefer=*/false, ExtraAttrs);
+  if (auto *F = dyn_cast<llvm::Function>(C))
+    if (F->empty())
+      F->setCallingConv(getBuiltinCC());
+  return C;
+}
+
 /// isTypeConstant - Determine whether an object of this type can be emitted
 /// as a constant.
 ///
diff --git a/lib/CodeGen/CodeGenModule.h b/lib/CodeGen/CodeGenModule.h
index 665a7a5..af4552c 100644
--- a/lib/CodeGen/CodeGenModule.h
+++ b/lib/CodeGen/CodeGenModule.h
@@ -150,6 +150,8 @@ struct CodeGenTypeCache {
 
   llvm::CallingConv::ID RuntimeCC;
   llvm::CallingConv::ID getRuntimeCC() const { return RuntimeCC; }
+  llvm::CallingConv::ID BuiltinCC;
+  llvm::CallingConv::ID getBuiltinCC() const { return BuiltinCC; }
 };
 
 struct RREntrypoints {
@@ -875,6 +877,11 @@ public:
                                         StringRef Name,
                                         llvm::AttributeSet ExtraAttrs =
                                           llvm::AttributeSet());
+  /// Create a new compiler builtin function with the specified type and name.
+  llvm::Constant *CreateBuiltinFunction(llvm::FunctionType *Ty,
+                                        StringRef Name,
+                                        llvm::AttributeSet ExtraAttrs =
+                                          llvm::AttributeSet());
   /// Create a new runtime global variable with the specified type and name.
   llvm::Constant *CreateRuntimeVariable(llvm::Type *Ty,
                                         StringRef Name);
diff --git a/lib/CodeGen/TargetInfo.cpp b/lib/CodeGen/TargetInfo.cpp
index f78c2d0..8dd720d 100644
--- a/lib/CodeGen/TargetInfo.cpp
+++ b/lib/CodeGen/TargetInfo.cpp
@@ -4159,7 +4159,7 @@ private:
 public:
   ARMABIInfo(CodeGenTypes &CGT, ABIKind _Kind) : ABIInfo(CGT), Kind(_Kind),
     NumVFPs(16), NumGPRs(4) {
-    setRuntimeCC();
+    setCCs();
     resetAllocatedRegs();
   }
 
@@ -4201,7 +4201,7 @@ private:
 
   llvm::CallingConv::ID getLLVMDefaultCC() const;
   llvm::CallingConv::ID getABIDefaultCC() const;
-  void setRuntimeCC();
+  void setCCs();
 
   void markAllocatedGPRs(unsigned Alignment, unsigned NumRequired) const;
   void markAllocatedVFPs(unsigned Alignment, unsigned NumRequired) const;
@@ -4363,7 +4363,7 @@ llvm::CallingConv::ID ARMABIInfo::getABIDefaultCC() const {
   llvm_unreachable("bad ABI kind");
 }
 
-void ARMABIInfo::setRuntimeCC() {
+void ARMABIInfo::setCCs() {
   assert(getRuntimeCC() == llvm::CallingConv::C);
 
   // Don't muddy up the IR with a ton of explicit annotations if
@@ -4371,6 +4371,9 @@ void ARMABIInfo::setRuntimeCC() {
   llvm::CallingConv::ID abiCC = getABIDefaultCC();
   if (abiCC != getLLVMDefaultCC())
     RuntimeCC = abiCC;
+
+  BuiltinCC = (getABIKind() == APCS ?
+               llvm::CallingConv::ARM_APCS : llvm::CallingConv::ARM_AAPCS);
 }
 
 /// isARMHomogeneousAggregate - Return true if a type is an AAPCS-VFP homogeneous
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: text/x-csrc
Size: 356 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141019/6d5aeaa0/attachment.c>


More information about the llvm-dev mailing list