[llvm] 5265adc - [SanitizerBinaryMetadata] Declare callbacks extern weak

Marco Elver via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 24 03:55:40 PST 2023


Author: Marco Elver
Date: 2023-01-24T12:54:20+01:00
New Revision: 5265adc73721963d3bf605a7ad5eab6a7e0850b8

URL: https://github.com/llvm/llvm-project/commit/5265adc73721963d3bf605a7ad5eab6a7e0850b8
DIFF: https://github.com/llvm/llvm-project/commit/5265adc73721963d3bf605a7ad5eab6a7e0850b8.diff

LOG: [SanitizerBinaryMetadata] Declare callbacks extern weak

Declare callbacks extern weak (if no existing declaration exists), and
only call if the function address is non-null.

This allows to attach semantic metadata to binaries where no user of
that metadata exists, avoiding to have to link empty stub callbacks.

Once the binary is linked (statically or dynamically) against a tool
runtime that implements the callbacks, the respective callbacks will be
called. This vastly simplifies gradual deployment of tools using the
metadata, esp. avoiding having to recompile large codebases with
different compiler flags (which negatively impacts compiler caches).

Reviewed By: dvyukov, vitalybuka

Differential Revision: https://reviews.llvm.org/D142408

Added: 
    compiler-rt/test/metadata/nocallback.cpp

Modified: 
    llvm/include/llvm/Transforms/Utils/ModuleUtils.h
    llvm/lib/Transforms/Instrumentation/SanitizerBinaryMetadata.cpp
    llvm/lib/Transforms/Utils/ModuleUtils.cpp
    llvm/test/Instrumentation/SanitizerBinaryMetadata/atomics.ll

Removed: 
    


################################################################################
diff  --git a/compiler-rt/test/metadata/nocallback.cpp b/compiler-rt/test/metadata/nocallback.cpp
new file mode 100644
index 0000000000000..b5ce5f7ac4c9a
--- /dev/null
+++ b/compiler-rt/test/metadata/nocallback.cpp
@@ -0,0 +1,13 @@
+// RUN: %clangxx %s -o %t -fexperimental-sanitize-metadata=all && %t | FileCheck %s
+
+// Test that the compiler emits weak declarations to the callbacks, which are
+// not called if they do not exist.
+
+#include <stdio.h>
+
+int main() {
+  printf("main\n");
+  return 0;
+}
+
+// CHECK: main

diff  --git a/llvm/include/llvm/Transforms/Utils/ModuleUtils.h b/llvm/include/llvm/Transforms/Utils/ModuleUtils.h
index 1a4e842de2950..e37547cb4efff 100644
--- a/llvm/include/llvm/Transforms/Utils/ModuleUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/ModuleUtils.h
@@ -49,7 +49,8 @@ void appendToGlobalDtors(Module &M, Function *F, int Priority,
 void setKCFIType(Module &M, Function &F, StringRef MangledType);
 
 FunctionCallee declareSanitizerInitFunction(Module &M, StringRef InitName,
-                                            ArrayRef<Type *> InitArgTypes);
+                                            ArrayRef<Type *> InitArgTypes,
+                                            bool Weak = false);
 
 /// Creates sanitizer constructor function.
 /// \return Returns pointer to constructor.
@@ -62,7 +63,7 @@ Function *createSanitizerCtor(Module &M, StringRef CtorName);
 std::pair<Function *, FunctionCallee> createSanitizerCtorAndInitFunctions(
     Module &M, StringRef CtorName, StringRef InitName,
     ArrayRef<Type *> InitArgTypes, ArrayRef<Value *> InitArgs,
-    StringRef VersionCheckName = StringRef());
+    StringRef VersionCheckName = StringRef(), bool Weak = false);
 
 /// Creates sanitizer constructor function lazily. If a constructor and init
 /// function already exist, this function returns it. Otherwise it calls \c
@@ -75,7 +76,7 @@ std::pair<Function *, FunctionCallee> getOrCreateSanitizerCtorAndInitFunctions(
     Module &M, StringRef CtorName, StringRef InitName,
     ArrayRef<Type *> InitArgTypes, ArrayRef<Value *> InitArgs,
     function_ref<void(Function *, FunctionCallee)> FunctionsCreatedCallback,
-    StringRef VersionCheckName = StringRef());
+    StringRef VersionCheckName = StringRef(), bool Weak = false);
 
 /// Rename all the anon globals in the module using a hash computed from
 /// the list of public globals in the module.

diff  --git a/llvm/lib/Transforms/Instrumentation/SanitizerBinaryMetadata.cpp b/llvm/lib/Transforms/Instrumentation/SanitizerBinaryMetadata.cpp
index 80eab77dbc568..142b9c38e5fcb 100644
--- a/llvm/lib/Transforms/Instrumentation/SanitizerBinaryMetadata.cpp
+++ b/llvm/lib/Transforms/Instrumentation/SanitizerBinaryMetadata.cpp
@@ -85,6 +85,11 @@ using MetadataInfoSet = SetVector<const MetadataInfo *>;
 
 //===--- Command-line options ---------------------------------------------===//
 
+cl::opt<bool> ClWeakCallbacks(
+    "sanitizer-metadata-weak-callbacks",
+    cl::desc("Declare callbacks extern weak, and only call if non-null."),
+    cl::Hidden, cl::init(true));
+
 cl::opt<bool> ClEmitCovered("sanitizer-metadata-covered",
                             cl::desc("Emit PCs for covered functions."),
                             cl::Hidden, cl::init(false));
@@ -197,15 +202,21 @@ bool SanitizerBinaryMetadata::run() {
         getSectionMarker(getSectionStart(MI->SectionSuffix), Int8PtrTy),
         getSectionMarker(getSectionEnd(MI->SectionSuffix), Int8PtrTy),
     };
+    // We declare the _add and _del functions as weak, and only call them if
+    // there is a valid symbol linked. This allows building binaries with
+    // semantic metadata, but without having callbacks. When a tool that wants
+    // the metadata is linked which provides the callbacks, they will be called.
     Function *Ctor =
         createSanitizerCtorAndInitFunctions(
             Mod, (MI->FunctionPrefix + ".module_ctor").str(),
-            (MI->FunctionPrefix + "_add").str(), InitTypes, InitArgs)
+            (MI->FunctionPrefix + "_add").str(), InitTypes, InitArgs,
+            /*VersionCheckName=*/StringRef(), /*Weak=*/ClWeakCallbacks)
             .first;
     Function *Dtor =
         createSanitizerCtorAndInitFunctions(
             Mod, (MI->FunctionPrefix + ".module_dtor").str(),
-            (MI->FunctionPrefix + "_del").str(), InitTypes, InitArgs)
+            (MI->FunctionPrefix + "_del").str(), InitTypes, InitArgs,
+            /*VersionCheckName=*/StringRef(), /*Weak=*/ClWeakCallbacks)
             .first;
     Constant *CtorData = nullptr;
     Constant *DtorData = nullptr;

diff  --git a/llvm/lib/Transforms/Utils/ModuleUtils.cpp b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
index 5148df55658ba..6d17a466957e4 100644
--- a/llvm/lib/Transforms/Utils/ModuleUtils.cpp
+++ b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
@@ -170,14 +170,17 @@ void llvm::setKCFIType(Module &M, Function &F, StringRef MangledType) {
   }
 }
 
-FunctionCallee
-llvm::declareSanitizerInitFunction(Module &M, StringRef InitName,
-                                   ArrayRef<Type *> InitArgTypes) {
+FunctionCallee llvm::declareSanitizerInitFunction(Module &M, StringRef InitName,
+                                                  ArrayRef<Type *> InitArgTypes,
+                                                  bool Weak) {
   assert(!InitName.empty() && "Expected init function name");
-  return M.getOrInsertFunction(
-      InitName,
-      FunctionType::get(Type::getVoidTy(M.getContext()), InitArgTypes, false),
-      AttributeList());
+  auto *VoidTy = Type::getVoidTy(M.getContext());
+  auto *FnTy = FunctionType::get(VoidTy, InitArgTypes, false);
+  auto FnCallee = M.getOrInsertFunction(InitName, FnTy);
+  auto *Fn = cast<Function>(FnCallee.getCallee());
+  if (Weak && Fn->isDeclaration())
+    Fn->setLinkage(Function::ExternalWeakLinkage);
+  return FnCallee;
 }
 
 Function *llvm::createSanitizerCtor(Module &M, StringRef CtorName) {
@@ -197,14 +200,33 @@ Function *llvm::createSanitizerCtor(Module &M, StringRef CtorName) {
 std::pair<Function *, FunctionCallee> llvm::createSanitizerCtorAndInitFunctions(
     Module &M, StringRef CtorName, StringRef InitName,
     ArrayRef<Type *> InitArgTypes, ArrayRef<Value *> InitArgs,
-    StringRef VersionCheckName) {
+    StringRef VersionCheckName, bool Weak) {
   assert(!InitName.empty() && "Expected init function name");
   assert(InitArgs.size() == InitArgTypes.size() &&
          "Sanitizer's init function expects 
diff erent number of arguments");
   FunctionCallee InitFunction =
-      declareSanitizerInitFunction(M, InitName, InitArgTypes);
+      declareSanitizerInitFunction(M, InitName, InitArgTypes, Weak);
   Function *Ctor = createSanitizerCtor(M, CtorName);
-  IRBuilder<> IRB(Ctor->getEntryBlock().getTerminator());
+  IRBuilder<> IRB(M.getContext());
+
+  BasicBlock *RetBB = &Ctor->getEntryBlock();
+  if (Weak) {
+    RetBB->setName("ret");
+    auto *EntryBB = BasicBlock::Create(M.getContext(), "entry", Ctor, RetBB);
+    auto *CallInitBB =
+        BasicBlock::Create(M.getContext(), "callfunc", Ctor, RetBB);
+    auto *InitFn = cast<Function>(InitFunction.getCallee());
+    auto *InitFnPtr =
+        PointerType::get(InitFn->getType(), InitFn->getAddressSpace());
+    IRB.SetInsertPoint(EntryBB);
+    Value *InitNotNull =
+        IRB.CreateICmpNE(InitFn, ConstantPointerNull::get(InitFnPtr));
+    IRB.CreateCondBr(InitNotNull, CallInitBB, RetBB);
+    IRB.SetInsertPoint(CallInitBB);
+  } else {
+    IRB.SetInsertPoint(RetBB->getTerminator());
+  }
+
   IRB.CreateCall(InitFunction, InitArgs);
   if (!VersionCheckName.empty()) {
     FunctionCallee VersionCheckFunction = M.getOrInsertFunction(
@@ -212,6 +234,10 @@ std::pair<Function *, FunctionCallee> llvm::createSanitizerCtorAndInitFunctions(
         AttributeList());
     IRB.CreateCall(VersionCheckFunction, {});
   }
+
+  if (Weak)
+    IRB.CreateBr(RetBB);
+
   return std::make_pair(Ctor, InitFunction);
 }
 
@@ -220,7 +246,7 @@ llvm::getOrCreateSanitizerCtorAndInitFunctions(
     Module &M, StringRef CtorName, StringRef InitName,
     ArrayRef<Type *> InitArgTypes, ArrayRef<Value *> InitArgs,
     function_ref<void(Function *, FunctionCallee)> FunctionsCreatedCallback,
-    StringRef VersionCheckName) {
+    StringRef VersionCheckName, bool Weak) {
   assert(!CtorName.empty() && "Expected ctor function name");
 
   if (Function *Ctor = M.getFunction(CtorName))
@@ -228,12 +254,13 @@ llvm::getOrCreateSanitizerCtorAndInitFunctions(
     // globals. This will make moving to a concurrent model much easier.
     if (Ctor->arg_empty() ||
         Ctor->getReturnType() == Type::getVoidTy(M.getContext()))
-      return {Ctor, declareSanitizerInitFunction(M, InitName, InitArgTypes)};
+      return {Ctor,
+              declareSanitizerInitFunction(M, InitName, InitArgTypes, Weak)};
 
   Function *Ctor;
   FunctionCallee InitFunction;
   std::tie(Ctor, InitFunction) = llvm::createSanitizerCtorAndInitFunctions(
-      M, CtorName, InitName, InitArgTypes, InitArgs, VersionCheckName);
+      M, CtorName, InitName, InitArgTypes, InitArgs, VersionCheckName, Weak);
   FunctionsCreatedCallback(Ctor, InitFunction);
   return std::make_pair(Ctor, InitFunction);
 }

diff  --git a/llvm/test/Instrumentation/SanitizerBinaryMetadata/atomics.ll b/llvm/test/Instrumentation/SanitizerBinaryMetadata/atomics.ll
index 01e504d3c6c2d..984950e42b32e 100644
--- a/llvm/test/Instrumentation/SanitizerBinaryMetadata/atomics.ll
+++ b/llvm/test/Instrumentation/SanitizerBinaryMetadata/atomics.ll
@@ -2036,16 +2036,40 @@ entry:
 ; Check that callbacks are emitted.
 
 ; CHECK-LABEL: __sanitizer_metadata_atomics.module_ctor
-; CHECK: call void @__sanitizer_metadata_atomics_add(i32 1, ptr @__start_sanmd_atomics, ptr @__stop_sanmd_atomics)
+; CHECK-DAG: entry:
+; CHECK-NEXT:  br i1 icmp ne (ptr @__sanitizer_metadata_atomics_add, ptr null), label %callfunc, label %ret
+; CHECK-DAG: callfunc:
+; CHECK-NEXT:  call void @__sanitizer_metadata_atomics_add(i32 1, ptr @__start_sanmd_atomics, ptr @__stop_sanmd_atomics)
+; CHECK-NEXT:  br label %ret
+; CHECK-DAG: ret:
+; CHECK-NEXT:  ret void
 
 ; CHECK-LABEL: __sanitizer_metadata_atomics.module_dtor
-; CHECK: call void @__sanitizer_metadata_atomics_del(i32 1, ptr @__start_sanmd_atomics, ptr @__stop_sanmd_atomics)
+; CHECK-DAG: entry:
+; CHECK-NEXT:  br i1 icmp ne (ptr @__sanitizer_metadata_atomics_del, ptr null), label %callfunc, label %ret
+; CHECK-DAG: callfunc:
+; CHECK-NEXT:  call void @__sanitizer_metadata_atomics_del(i32 1, ptr @__start_sanmd_atomics, ptr @__stop_sanmd_atomics)
+; CHECK-NEXT:  br label %ret
+; CHECK-DAG: ret:
+; CHECK-NEXT:  ret void
 
 ; CHECK-LABEL: __sanitizer_metadata_covered.module_ctor
-; CHECK: call void @__sanitizer_metadata_covered_add(i32 1, ptr @__start_sanmd_covered, ptr @__stop_sanmd_covered)
+; CHECK-DAG: entry:
+; CHECK-NEXT:  br i1 icmp ne (ptr @__sanitizer_metadata_covered_add, ptr null), label %callfunc, label %ret
+; CHECK-DAG: callfunc:
+; CHECK-NEXT:  call void @__sanitizer_metadata_covered_add(i32 1, ptr @__start_sanmd_covered, ptr @__stop_sanmd_covered)
+; CHECK-NEXT:  br label %ret
+; CHECK-DAG: ret:
+; CHECK-NEXT:  ret void
 
 ; CHECK-LABEL: __sanitizer_metadata_covered.module_dtor
-; CHECK: call void @__sanitizer_metadata_covered_del(i32 1, ptr @__start_sanmd_covered, ptr @__stop_sanmd_covered)
+; CHECK-DAG: entry:
+; CHECK-NEXT:  br i1 icmp ne (ptr @__sanitizer_metadata_covered_del, ptr null), label %callfunc, label %ret
+; CHECK-DAG: callfunc:
+; CHECK-NEXT:  call void @__sanitizer_metadata_covered_del(i32 1, ptr @__start_sanmd_covered, ptr @__stop_sanmd_covered)
+; CHECK-NEXT:  br label %ret
+; CHECK-DAG: ret:
+; CHECK-NEXT:  ret void
 
 ; CHECK: !0 = !{!"sanmd_covered", !1}
 ; CHECK: !1 = !{i32 1}


        


More information about the llvm-commits mailing list