[clang] 5a33859 - [CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting

Fangrui Song via cfe-commits cfe-commits at lists.llvm.org
Thu Oct 15 15:14:49 PDT 2020


Author: Fangrui Song
Date: 2020-10-15T15:14:38-07:00
New Revision: 5a338599fbaa805587227779bde0a9986cb4646d

URL: https://github.com/llvm/llvm-project/commit/5a338599fbaa805587227779bde0a9986cb4646d
DIFF: https://github.com/llvm/llvm-project/commit/5a338599fbaa805587227779bde0a9986cb4646d.diff

LOG: [CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting

rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with
specialized emitting (e.g. memcpy, various math functions) still do not work.

This patch makes these functions work for `asm()` and `#pragma redefine_extname`.
glibc uses `asm()` to redirect internal libc function calls to hidden aliases.

Limitation: such a function is a builtin in clang, but will not be recognized as
a libcall in optimization passes because Clang does not annotate the renamed
function as a libcall.  In GCC -O1 or above, `abs` can be optimized out but we can't.
Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example:

  double sin(double x) asm("real_sin");
  double f(double d) { return __builtin_sin(d); }

---

According to @rsmith, the following three statements cannot be simultaneously true:

(1) The frontend function foo has known, builtin semantics X.
(2) The symbol foo has known, builtin semantics X.
(3) It's not correct to lower a call to the frontend function foo to the symbol foo.

People do want (1) (if it is profitable to expand a memcpy, do it).
This also means that people do not want to add -fno-builtin-memcpy.
People do want (3): that is why they use asm("__GI_memcpy") in the first place.

So unfortunately we make a compromise by not refuting (2) (see the limitation above).
For most libcalls, there is a small loss because compilers don't synthesize them.
For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make
the assembly level redirection.
(Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable).

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D88712

Added: 
    

Modified: 
    clang/lib/CodeGen/CGBuiltin.cpp
    clang/test/CodeGen/asm-label.c
    clang/test/CodeGen/redefine_extname.c

Removed: 
    


################################################################################
diff  --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 884fa1103163..90bfeefc9970 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -1663,13 +1663,20 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
                                                Result.Val.getFloat()));
   }
 
+  // If the builtin has been declared explicitly with an assembler label,
+  // disable the specialized emitting below. Ideally we should communicate the
+  // rename in IR, or at least avoid generating the intrinsic calls that are
+  // likely to get lowered to the renamed library functions.
+  const unsigned BuiltinIDIfNoAsmLabel =
+      FD->hasAttr<AsmLabelAttr>() ? 0 : BuiltinID;
+
   // There are LLVM math intrinsics/instructions corresponding to math library
   // functions except the LLVM op will never set errno while the math library
   // might. Also, math builtins have the same semantics as their math library
   // twins. Thus, we can transform math library and builtin calls to their
   // LLVM counterparts if the call is marked 'const' (known to never set errno).
   if (FD->hasAttr<ConstAttr>()) {
-    switch (BuiltinID) {
+    switch (BuiltinIDIfNoAsmLabel) {
     case Builtin::BIceil:
     case Builtin::BIceilf:
     case Builtin::BIceill:
@@ -1946,7 +1953,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     }
   }
 
-  switch (BuiltinID) {
+  switch (BuiltinIDIfNoAsmLabel) {
   default: break;
   case Builtin::BI__builtin___CFStringMakeConstantString:
   case Builtin::BI__builtin___NSStringMakeConstantString:

diff  --git a/clang/test/CodeGen/asm-label.c b/clang/test/CodeGen/asm-label.c
index 7c7383fa41fc..093f31fc59c4 100644
--- a/clang/test/CodeGen/asm-label.c
+++ b/clang/test/CodeGen/asm-label.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -triple=i686-pc-linux-gnu -emit-llvm %s -o - | FileCheck %s --check-prefix=LINUX
-// RUN: %clang_cc1 -triple=i686-apple-darwin9 -emit-llvm %s -o - | FileCheck %s --check-prefix=DARWIN
+// RUN: %clang_cc1 -triple=i686-pc-linux-gnu -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,LINUX
+// RUN: %clang_cc1 -triple=i686-apple-darwin9 -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,DARWIN
 
 char *strerror(int) asm("alias");
 int x __asm("foo");
@@ -17,3 +17,39 @@ int *test(void) {
 // DARWIN: @"\01bar" = internal global i32 0
 // DARWIN: @"\01foo" = global i32 0
 // DARWIN: declare i8* @"\01alias"(i32)
+
+extern void *memcpy(void *__restrict, const void *__restrict, unsigned long);
+extern __typeof(memcpy) memcpy asm("__GI_memcpy");
+void test_memcpy(void *dst, void *src, unsigned long n) {
+  memcpy(dst, src, n);
+}
+// CHECK-LABEL: @test_memcpy(
+// LINUX:         call i8* @__GI_memcpy(
+// LINUX:       declare i8* @__GI_memcpy(i8*, i8*, i32)
+// DARWIN:        call i8* @"\01__GI_memcpy"(
+// DARWIN:      declare i8* @"\01__GI_memcpy"(i8*, i8*, i32)
+
+long lrint(double x) asm("__GI_lrint");
+long test_lrint(double x) {
+  return lrint(x);
+}
+// CHECK-LABEL: @test_lrint(
+// LINUX:         call i32 @__GI_lrint(
+// LINUX:       declare i32 @__GI_lrint(double)
+// DARWIN:        call i32 @"\01__GI_lrint"(
+// DARWIN:      declare i32 @"\01__GI_lrint"(double)
+
+/// NOTE: GCC can optimize out abs in -O1 or above. Clang does not
+/// communicate the mapping to the backend so the libcall cannot be eliminated.
+int abs(int x) asm("__GI_abs");
+long test_abs(int x) {
+  return abs(x);
+}
+// CHECK-LABEL: @test_abs(
+// LINUX:         call i32 @__GI_abs(
+
+/// FIXME: test_sin should call real_sin instead.
+double sin(double x) asm("real_sin");
+double test_sin(double d) { return __builtin_sin(d); }
+// CHECK-LABEL: @test_sin(
+// LINUX:         call double @llvm.sin.f64(

diff  --git a/clang/test/CodeGen/redefine_extname.c b/clang/test/CodeGen/redefine_extname.c
index d56527a88953..5661df546991 100644
--- a/clang/test/CodeGen/redefine_extname.c
+++ b/clang/test/CodeGen/redefine_extname.c
@@ -30,3 +30,9 @@ static int foo_static() { return 1; }
 int baz() { return foo_static(); }
 // CHECK-NOT: call i32 @bar_static()
 
+// Check that pragma redefine_extname applies to builtin functions.
+typedef unsigned long size_t;
+extern void *memcpy(void *, const void *, size_t);
+#pragma redefine_extname memcpy __GI_memcpy
+void *test_memcpy(void *dst, const void *src, size_t n) { return memcpy(dst, src, n); }
+// CHECK: call i8* @__GI_memcpy(


        


More information about the cfe-commits mailing list