[llvm] [AsmParser] Implicitly declare intrinsics (PR #78251)

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 16 02:26:58 PST 2024


https://github.com/nikic created https://github.com/llvm/llvm-project/pull/78251

We currently require that all referenced globals have an explicit declaration or definition in the IR. For intrinsics, this requirement is redundant: they can only be called directly, and a "direct" call whose function type differs from the declaration's is not allowed either. The function type used at the call site therefore fully determines the function type of the intrinsic declaration.

Relax this requirement, and implicitly declare any intrinsics that do not have an explicit declaration. This will remove a common annoyance when writing tests and alive2 proofs.

(I also plan to introduce a mode where declarations for all missing symbols will be automatically added, to make working with incomplete IR easier -- but that will be behind a default-disabled flag.)
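
As a rough illustration of the rule described above, here is a standalone C++ sketch of the "all uses are calls with one common function type" check. It is a simplified model, not the parser code: the `Use` struct and the string-encoded function types are hypothetical stand-ins for `llvm::User` and `llvm::FunctionType`, chosen so the snippet compiles without LLVM.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one use of a forward-referenced value.
// In the real parser these are llvm::User objects; here a use is either
// a call (carrying the function type it was written with) or something else.
struct Use {
  bool IsCall;            // true if the use is a call instruction
  std::string FunctionTy; // e.g. "i8 (i8, i8)"; only meaningful for calls
};

// Returns the single function type shared by all uses, or std::nullopt if
// there are no uses, some use is not a call, or two calls disagree.
std::optional<std::string>
getCommonFunctionType(const std::vector<Use> &Uses) {
  std::optional<std::string> Common;
  for (const Use &U : Uses) {
    if (!U.IsCall || (Common && *Common != U.FunctionTy))
      return std::nullopt;
    Common = U.FunctionTy;
  }
  return Common;
}
```

In this model, two calls written with `i8 (i8, i8)` yield that type, so a declaration could be synthesized from it. One call with `i8 (i8, i8)` and another with `i16 (i16, i16)` on the same name yield no common type, which corresponds to the case where the "use of undefined value" error is still reported.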

From 128071dbabf44eebf6c44f1b5f91ac8971877860 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Tue, 16 Jan 2024 11:21:04 +0100
Subject: [PATCH] [AsmParser] Implicitly declare intrinsics

We currently require that all referenced globals have an explicit
declaration or definition in the IR. For intrinsics, this
requirement is redundant, because they cannot be called indirectly
(including "direct" calls with mismatched function type). The
function type used in the call directly determines the function type
of the intrinsic declaration.

Relax this requirement, and implicitly declare any intrinsics that
do not have an explicit declaration. This will remove a common
annoyance when writing tests and alive2 proofs.

(I also plan to introduce a mode where declarations for all missing
symbols will be automatically added, to make working with incomplete
IR easier -- but that will be behind a default-disabled flag.)
---
 llvm/lib/AsmParser/LLParser.cpp               | 28 +++++++++++++++++++
 .../implicit-intrinsic-declaration-invalid.ll | 11 ++++++++
 .../implicit-intrinsic-declaration.ll         | 24 ++++++++++++++++
 3 files changed, 63 insertions(+)
 create mode 100644 llvm/test/Assembler/implicit-intrinsic-declaration-invalid.ll
 create mode 100644 llvm/test/Assembler/implicit-intrinsic-declaration.ll

diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index fb9e1ba875e1fa2..7127791fbf2fcca 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -246,6 +246,34 @@ bool LLParser::validateEndOfModule(bool UpgradeDebugInfo) {
                  "use of undefined comdat '$" +
                      ForwardRefComdats.begin()->first + "'");
 
+  // Automatically create declarations for intrinsics. Intrinsics can only be
+  // called directly, so the call function type directly determines the
+  // declaration function type.
+  for (const auto &[Name, Info] : make_early_inc_range(ForwardRefVals)) {
+    if (!StringRef(Name).starts_with("llvm."))
+      continue;
+
+    // Don't do anything if the intrinsic is called with different function
+    // types. This would result in a verifier error anyway.
+    auto GetCommonFunctionType = [](Value *V) -> FunctionType * {
+      FunctionType *FTy = nullptr;
+      for (User *U : V->users()) {
+        auto *CB = dyn_cast<CallBase>(U);
+        if (!CB || (FTy && FTy != CB->getFunctionType()))
+          return nullptr;
+        FTy = CB->getFunctionType();
+      }
+      return FTy;
+    };
+    if (FunctionType *FTy = GetCommonFunctionType(Info.first)) {
+      Function *Fn =
+          Function::Create(FTy, GlobalValue::ExternalLinkage, Name, M);
+      Info.first->replaceAllUsesWith(Fn);
+      Info.first->eraseFromParent();
+      ForwardRefVals.erase(Name);
+    }
+  }
+
   if (!ForwardRefVals.empty())
     return error(ForwardRefVals.begin()->second.second,
                  "use of undefined value '@" + ForwardRefVals.begin()->first +
diff --git a/llvm/test/Assembler/implicit-intrinsic-declaration-invalid.ll b/llvm/test/Assembler/implicit-intrinsic-declaration-invalid.ll
new file mode 100644
index 000000000000000..0cb35b390337a00
--- /dev/null
+++ b/llvm/test/Assembler/implicit-intrinsic-declaration-invalid.ll
@@ -0,0 +1,11 @@
+; RUN: not llvm-as < %s 2>&1 | FileCheck %s
+
+; Check that intrinsics do not get automatically declared if they are used
+; with different function types.
+
+; CHECK: error: use of undefined value '@llvm.umax'
+define void @test() {
+  call i8 @llvm.umax(i8 0, i8 1)
+  call i16 @llvm.umax(i16 0, i16 1)
+  ret void
+}
diff --git a/llvm/test/Assembler/implicit-intrinsic-declaration.ll b/llvm/test/Assembler/implicit-intrinsic-declaration.ll
new file mode 100644
index 000000000000000..ea1324188fba3f7
--- /dev/null
+++ b/llvm/test/Assembler/implicit-intrinsic-declaration.ll
@@ -0,0 +1,24 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S < %s | FileCheck %s
+
+; llvm.umax is intentionally missing the mangling suffix here, to show that
+; this works fine with auto-upgrade.
+define void @test(i8 %x, i8 %y) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], -1
+; CHECK-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; CHECK-NEXT:    [[MAX1:%.*]] = call i8 @llvm.umax.i8(i8 [[X]], i8 [[Y]])
+; CHECK-NEXT:    [[X_EXT:%.*]] = zext i8 [[X]] to i16
+; CHECK-NEXT:    [[Y_EXT:%.*]] = zext i8 [[Y]] to i16
+; CHECK-NEXT:    [[MAX2:%.*]] = call i16 @llvm.umax.i16(i16 [[X_EXT]], i16 [[Y_EXT]])
+; CHECK-NEXT:    ret void
+;
+  %cmp = icmp sgt i8 %x, -1
+  call void @llvm.assume(i1 %cmp)
+  %max1 = call i8 @llvm.umax(i8 %x, i8 %y)
+  %x.ext = zext i8 %x to i16
+  %y.ext = zext i8 %y to i16
+  %max2 = call i16 @llvm.umax.i16(i16 %x.ext, i16 %y.ext)
+  ret void
+}
