[clang] Add support for builtin_verbose_trap (PR #79230)

Akira Hatanaka via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 23 15:54:16 PST 2024


https://github.com/ahatanak created https://github.com/llvm/llvm-project/pull/79230

The builtin causes the program to stop its execution abnormally and shows a human-readable description of the reason for the termination when a debugger is attached or in a symbolicated crash log.

The motivation for the builtin is explained in the following RFC:

https://discourse.llvm.org/t/rfc-adding-builtin-verbose-trap-string-literal/75845

>From 92c9fccde29eec10c775fd86b9cf625987e7929d Mon Sep 17 00:00:00 2001
From: Akira Hatanaka <ahatanak at gmail.com>
Date: Tue, 16 Jan 2024 13:18:09 -0800
Subject: [PATCH] Add support for builtin_verbose_trap

The builtin causes the program to stop its execution abnormally and shows a
human-readable description of the reason for the termination when a debugger is
attached or in a symbolicated crash log.

The motivation for the builtin is explained in the following RFC:

https://discourse.llvm.org/t/rfc-adding-builtin-verbose-trap-string-literal/75845
---
 clang/docs/LanguageExtensions.rst             | 48 ++++++++++++++++++
 clang/docs/ReleaseNotes.rst                   |  2 +
 clang/include/clang/AST/Expr.h                |  5 ++
 clang/include/clang/Basic/Builtins.def        |  1 +
 .../clang/Basic/DiagnosticSemaKinds.td        |  2 +
 clang/lib/AST/ExprConstant.cpp                | 18 +++++--
 clang/lib/CodeGen/CGBuiltin.cpp               | 12 +++++
 clang/lib/CodeGen/CGDebugInfo.cpp             | 42 ++++++++++++++++
 clang/lib/CodeGen/CGDebugInfo.h               | 20 ++++++++
 clang/lib/Sema/SemaChecking.cpp               | 22 +++++++++
 .../CodeGenCXX/debug-info-verbose-trap.cpp    | 49 +++++++++++++++++++
 clang/test/SemaCXX/verbose-trap.cpp           | 28 +++++++++++
 12 files changed, 246 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenCXX/debug-info-verbose-trap.cpp
 create mode 100644 clang/test/SemaCXX/verbose-trap.cpp

diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index c1420079f75118d..4526bc2df53e422 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -3379,6 +3379,54 @@ Query for this feature with ``__has_builtin(__builtin_debugtrap)``.
 
 Query for this feature with ``__has_builtin(__builtin_trap)``.
 
+``__builtin_verbose_trap``
+------------------
+
+``__builtin_verbose_trap`` causes the program to stop its execution abnormally
+and shows a human-readable description of the reason for the termination when a
+debugger is attached or in a symbolicated crash log.
+
+**Syntax**:
+
+.. code-block:: c++
+
+    __builtin_verbose_trap(const char *reason)
+
+**Description**
+
+``__builtin_verbose_trap`` is lowered to the ` ``llvm.trap`` <https://llvm.org/docs/LangRef.html#llvm-trap-intrinsic>`_ builtin.
+Additionally, clang emits debug metadata that represents an artificial inline
+frame whose name encodes the string passed to the builtin, prefixed by a "magic"
+prefix.
+
+For example, consider the following code:
+
+.. code-block:: c++
+
+    void foo(int* p) {
+      if (p == nullptr)
+        __builtin_verbose_trap("Argument_must_not_be_null");
+    }
+
+The debug metadata would look as if it were produced for the following code:
+
+.. code-block:: c++
+
+    __attribute__((always_inline))
+    inline void "__llvm_verbose_trap: Argument_must_not_be_null"() {
+      __builtin_trap();
+    }
+
+    void foo(int* p) {
+      if (p == nullptr)
+        "__llvm_verbose_trap: Argument_must_not_be_null"();
+    }
+
+However, the LLVM IR would not actually contain a call to the artificial
+function — it only exists in the debug metadata.
+
+Query for this feature with ``__has_builtin(__builtin_verbose_trap)``.
+
 ``__builtin_nondeterministic_value``
 ------------------------------------
 
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 6e31849ce16dd40..6da216bbd19250a 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -268,6 +268,8 @@ Non-comprehensive list of changes in this release
   and vector types as return value ``19``, to match GCC 14's behavior.
 * The default value of `_MSC_VER` was raised from 1920 to 1933.
 * Since MSVC 19.33 added undocumented attribute ``[[msvc::constexpr]]``, this release adds the attribute as well.
+* Support for ``__builtin_verbose_trap`` has been added. See
+  https://clang.llvm.org/docs/LanguageExtensions.html#builtin-functions.
 
 * Added ``#pragma clang fp reciprocal``.
 
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index a41f2d66b37b69d..34d913748d3021f 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -774,6 +774,11 @@ class Expr : public ValueStmt {
                                  const Expr *PtrExpression, ASTContext &Ctx,
                                  EvalResult &Status) const;
 
+  /// If the current Expr can be evaluated to a pointer to a null-terminated
+  /// constant string, return the constant string (without the terminating null)
+  /// in Result. Return true if it succeeds.
+  bool tryEvaluateString(std::string &Result, ASTContext &Ctx) const;
+
   /// Enumeration used to describe the kind of Null pointer constant
   /// returned from \c isNullPointerConstant().
   enum NullPointerConstantKind {
diff --git a/clang/include/clang/Basic/Builtins.def b/clang/include/clang/Basic/Builtins.def
index 4dcbaf8a7beaa65..e9bea088941987f 100644
--- a/clang/include/clang/Basic/Builtins.def
+++ b/clang/include/clang/Basic/Builtins.def
@@ -673,6 +673,7 @@ BUILTIN(__builtin_expect_with_probability, "LiLiLid", "ncE")
 BUILTIN(__builtin_prefetch, "vvC*.", "nc")
 BUILTIN(__builtin_readcyclecounter, "ULLi", "n")
 BUILTIN(__builtin_trap, "v", "nr")
+BUILTIN(__builtin_verbose_trap, "vcC*", "nr")
 BUILTIN(__builtin_debugtrap, "v", "n")
 BUILTIN(__builtin_unreachable, "v", "nr")
 BUILTIN(__builtin_shufflevector, "v."   , "nct")
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index c3213f7ccb8c88e..bc01c7876e52012 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -8764,6 +8764,8 @@ def err_expected_callable_argument : Error<
   "expected a callable expression as %ordinal0 argument to %1, found %2">;
 def note_building_builtin_dump_struct_call : Note<
   "in call to printing function with arguments '(%0)' while dumping struct">;
+def err_builtin_verbose_trap_arg : Error<
+  "argument to __builtin_verbose_trap must be a non-empty string literal">;
 
 def err_atomic_load_store_uses_lib : Error<
   "atomic %select{load|store}0 requires runtime support that is not "
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index f20850d14c0c860..0519c604b63a5ad 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -1880,7 +1880,8 @@ static bool EvaluateAtomic(const Expr *E, const LValue *This, APValue &Result,
                            EvalInfo &Info);
 static bool EvaluateAsRValue(EvalInfo &Info, const Expr *E, APValue &Result);
 static bool EvaluateBuiltinStrLen(const Expr *E, uint64_t &Result,
-                                  EvalInfo &Info);
+                                  EvalInfo &Info,
+                                  std::string *StringResult = nullptr);
 
 /// Evaluate an integer or fixed point expression into an APResult.
 static bool EvaluateFixedPointOrInteger(const Expr *E, APFixedPoint &Result,
@@ -16612,7 +16613,7 @@ bool Expr::tryEvaluateObjectSize(uint64_t &Result, ASTContext &Ctx,
 }
 
 static bool EvaluateBuiltinStrLen(const Expr *E, uint64_t &Result,
-                                  EvalInfo &Info) {
+                                  EvalInfo &Info, std::string *StringResult) {
   if (!E->getType()->hasPointerRepresentation() || !E->isPRValue())
     return false;
 
@@ -16639,6 +16640,8 @@ static bool EvaluateBuiltinStrLen(const Expr *E, uint64_t &Result,
         Str = Str.substr(0, Pos);
 
       Result = Str.size();
+      if (StringResult)
+        *StringResult = Str;
       return true;
     }
 
@@ -16654,12 +16657,21 @@ static bool EvaluateBuiltinStrLen(const Expr *E, uint64_t &Result,
     if (!Char.getInt()) {
       Result = Strlen;
       return true;
-    }
+    } else if (StringResult)
+      StringResult->push_back(Char.getInt().getExtValue());
     if (!HandleLValueArrayAdjustment(Info, E, String, CharTy, 1))
       return false;
   }
 }
 
+bool Expr::tryEvaluateString(std::string &StringResult, ASTContext &Ctx) const {
+  Expr::EvalStatus Status;
+  EvalInfo Info(Ctx, Status, EvalInfo::EM_ConstantFold);
+  uint64_t Result;
+
+  return EvaluateBuiltinStrLen(this, Result, Info, &StringResult);
+}
+
 bool Expr::EvaluateCharRangeAsString(std::string &Result,
                                      const Expr *SizeExpression,
                                      const Expr *PtrExpression, ASTContext &Ctx,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 1ed35befe1361f4..27401e6c00647a8 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3212,6 +3212,18 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   case Builtin::BI__builtin_trap:
     EmitTrapCall(Intrinsic::trap);
     return RValue::get(nullptr);
+  case Builtin::BI__builtin_verbose_trap: {
+    llvm::DILocation *TrapLocation = Builder.getCurrentDebugLocation();
+    if (getDebugInfo()) {
+      std::string Str;
+      E->getArg(0)->tryEvaluateString(Str, getContext());
+      TrapLocation = getDebugInfo()->CreateTrapFailureMessageFor(
+          TrapLocation, "__llvm_verbose_trap", Str);
+    }
+    ApplyDebugLocation ApplyTrapDI(*this, TrapLocation);
+    EmitTrapCall(Intrinsic::trap);
+    return RValue::get(nullptr);
+  }
   case Builtin::BI__debugbreak:
     EmitTrapCall(Intrinsic::debugtrap);
     return RValue::get(nullptr);
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 236d53bee4e8f18..9b939ce560e7bba 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -1628,6 +1628,27 @@ llvm::DIType *CGDebugInfo::createFieldType(
                                    offsetInBits, flags, debugType, Annotations);
 }
 
+llvm::DISubprogram *
+CGDebugInfo::getFakeFuncSubprogram(const std::string &FakeFuncName) {
+  llvm::DISubprogram *&SP = FakeFuncMap[FakeFuncName];
+
+  if (!SP) {
+    auto FileScope = TheCU->getFile();
+    llvm::DISubroutineType *DIFnTy = DBuilder.createSubroutineType(nullptr);
+    // Note: We use `FileScope` rather than `TheCU` as the scope because that's
+    // what LLVM's inliner seems to do.
+    SP = DBuilder.createFunction(
+        /*Scope=*/FileScope, /*Name=*/FakeFuncName, /*LinkageName=*/StringRef(),
+        /*File=*/FileScope, /*LineNo=*/0, /*Ty=*/DIFnTy,
+        /*ScopeLine=*/0,
+        /*Flags=*/llvm::DINode::FlagArtificial,
+        /*SPFlags=*/llvm::DISubprogram::SPFlagDefinition,
+        /*TParams=*/nullptr, /*ThrownTypes=*/nullptr, /*Annotations=*/nullptr);
+  }
+
+  return SP;
+}
+
 void CGDebugInfo::CollectRecordLambdaFields(
     const CXXRecordDecl *CXXDecl, SmallVectorImpl<llvm::Metadata *> &elements,
     llvm::DIType *RecordTy) {
@@ -3416,6 +3437,27 @@ llvm::DIMacroFile *CGDebugInfo::CreateTempMacroFile(llvm::DIMacroFile *Parent,
   return DBuilder.createTempMacroFile(Parent, Line, FName);
 }
 
+llvm::DILocation *CGDebugInfo::CreateTrapFailureMessageFor(
+    llvm::DebugLoc TrapLocation, StringRef Prefix, StringRef FailureMsg) {
+  // Create debug info that describes a fake function whose name is the failure
+  // message.
+  std::string FuncName(Prefix);
+  if (!FailureMsg.empty()) {
+    // A space in the function name identifies this as not being a real function
+    // because it's not a valid symbol name.
+    FuncName += ": ";
+    FuncName += FailureMsg;
+  }
+
+  assert(FuncName.size() > 0);
+  assert(FuncName.find(' ') != std::string::npos &&
+         "Fake function name must contain a space");
+
+  llvm::DISubprogram *TrapSP = getFakeFuncSubprogram(FuncName);
+  return llvm::DILocation::get(CGM.getLLVMContext(), /*Line=*/0, /*Column=*/0,
+                               /*Scope=*/TrapSP, /*InlinedAt=*/TrapLocation);
+}
+
 static QualType UnwrapTypeForDebugInfo(QualType T, const ASTContext &C) {
   Qualifiers Quals;
   do {
diff --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index 7b60e94555d0608..21edb5536742600 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -346,6 +346,14 @@ class CGDebugInfo {
       const FieldDecl *BitFieldDecl, const llvm::DIDerivedType *BitFieldDI,
       llvm::ArrayRef<llvm::Metadata *> PreviousFieldsDI, const RecordDecl *RD);
 
+  // A cache that maps fake function names used for __builtin_verbose_trap to
+  // subprograms.
+  std::map<std::string, llvm::DISubprogram *> FakeFuncMap;
+
+  // A function that returns the subprogram corresponding to the fake function
+  // name.
+  llvm::DISubprogram *getFakeFuncSubprogram(const std::string &FakeFuncName);
+
   /// Helpers for collecting fields of a record.
   /// @{
   void CollectRecordLambdaFields(const CXXRecordDecl *CXXDecl,
@@ -602,6 +610,18 @@ class CGDebugInfo {
     return CoroutineParameterMappings;
   }
 
+  // Create a debug location from `TrapLocation` that adds a fake
+  // inline frame where the frame name is
+  //
+  // * `<Prefix>: <FailureMsg>` if `<FailureMsg>` is not empty.
+  // * `<Prefix>` if `<FailureMsg>` is empty. Note `<Prefix>` must
+  //   contain a space.
+  //
+  // This is used to store failure reasons for traps.
+  llvm::DILocation *CreateTrapFailureMessageFor(llvm::DebugLoc TrapLocation,
+                                                StringRef Prefix,
+                                                StringRef FailureMsg);
+
 private:
   /// Emit call to llvm.dbg.declare for a variable declaration.
   /// Returns a pointer to the DILocalVariable associated with the
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index ace3e386988f005..2a340236c73e8ed 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -171,6 +171,23 @@ static bool checkArgCount(Sema &S, CallExpr *Call, unsigned DesiredArgCount) {
          << /*is non object*/ 0 << Call->getArg(1)->getSourceRange();
 }
 
+static bool checkBuiltinVerboseTrap(CallExpr *Call, Sema &S) {
+  Expr *Arg = Call->getArg(0);
+
+  if (Arg->isValueDependent())
+    return true;
+
+  // FIXME: Add more checks and reject strings that can't be handled by
+  // debuggers.
+  std::string Result;
+  if (!Arg->tryEvaluateString(Result, S.Context) || Result.empty()) {
+    S.Diag(Arg->getBeginLoc(), diag::err_builtin_verbose_trap_arg)
+        << Arg->getSourceRange();
+    return false;
+  }
+  return true;
+}
+
 static bool convertArgumentToType(Sema &S, Expr *&Value, QualType Ty) {
   if (Value->isTypeDependent())
     return false;
@@ -2881,6 +2898,11 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
   case Builtin::BI__builtin_matrix_column_major_store:
     return SemaBuiltinMatrixColumnMajorStore(TheCall, TheCallResult);
 
+  case Builtin::BI__builtin_verbose_trap:
+    if (!checkBuiltinVerboseTrap(TheCall, *this))
+      return ExprError();
+    break;
+
   case Builtin::BI__builtin_get_device_side_mangled_name: {
     auto Check = [](CallExpr *TheCall) {
       if (TheCall->getNumArgs() != 1)
diff --git a/clang/test/CodeGenCXX/debug-info-verbose-trap.cpp b/clang/test/CodeGenCXX/debug-info-verbose-trap.cpp
new file mode 100644
index 000000000000000..3433a08312f274a
--- /dev/null
+++ b/clang/test/CodeGenCXX/debug-info-verbose-trap.cpp
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -std=c++20 -emit-llvm -debug-info-kind=limited %s -o - | FileCheck %s
+
+// CHECK-LABEL: define void @_Z2f0v()
+// CHECK: call void @llvm.trap(), !dbg ![[LOC17:.*]]
+
+// CHECK-LABEL: define void @_Z2f1v()
+// CHECK: call void @llvm.trap(), !dbg ![[LOC23:.*]]
+// CHECK: call void @llvm.trap(), !dbg ![[LOC25:.*]]
+
+// CHECK-LABEL: define void @_Z2f3v()
+// CHECK: call void @_Z2f2IXadsoKcL_ZL8constMsgEEEEvv()
+
+// CHECK-LABEL: define internal void @_Z2f2IXadsoKcL_ZL8constMsgEEEEvv()
+// CHECK: call void @llvm.trap(), !dbg ![[LOC36:.*]]
+
+// CHECK: ![[FILESCOPE:.*]] = !DIFile(filename:
+// CHECK: ![[SUBPROG14:.*]] = distinct !DISubprogram(name: "f0", linkageName: "_Z2f0v",
+// CHECK: ![[LOC17]] = !DILocation(line: 0, scope: ![[SUBPROG18:.*]], inlinedAt: ![[LOC20:.*]])
+// CHECK: ![[SUBPROG18]] = distinct !DISubprogram(name: "__llvm_verbose_trap: Argument_must_not_be_null", scope: ![[FILESCOPE]], file: ![[FILESCOPE]], type: !{{.*}}, flags: DIFlagArtificial, spFlags: DISPFlagDefinition, unit: !{{.*}})
+// CHECK: ![[LOC20]] = !DILocation(line: 34, column: 3, scope: ![[SUBPROG14]])
+// CHECK: ![[SUBPROG22:.*]] = distinct !DISubprogram(name: "f1", linkageName: "_Z2f1v",
+// CHECK: ![[LOC23]] = !DILocation(line: 0, scope: ![[SUBPROG18]], inlinedAt: ![[LOC24:.*]])
+// CHECK: ![[LOC24]] = !DILocation(line: 38, column: 3, scope: ![[SUBPROG22]])
+// CHECK: ![[LOC25]] = !DILocation(line: 0, scope: ![[SUBPROG26:.*]], inlinedAt: ![[LOC27:.*]])
+// CHECK: ![[SUBPROG26]] = distinct !DISubprogram(name: "__llvm_verbose_trap: hello", scope: ![[FILESCOPE]], file: ![[FILESCOPE]], type: !{{.*}}, flags: DIFlagArtificial, spFlags: DISPFlagDefinition, unit: !{{.*}})
+// CHECK: ![[LOC27]] = !DILocation(line: 39, column: 3, scope: ![[SUBPROG22]])
+// CHECK: ![[SUBPROG32:.*]] = distinct !DISubprogram(name: "f2<constMsg>", linkageName: "_Z2f2IXadsoKcL_ZL8constMsgEEEEvv",
+// CHECK: ![[LOC36]] = !DILocation(line: 0, scope: ![[SUBPROG26]], inlinedAt: ![[LOC37:.*]])
+// CHECK: ![[LOC37]] = !DILocation(line: 44, column: 3, scope: ![[SUBPROG32]])
+
+char const constMsg[] = "hello";
+
+void f0() {
+  __builtin_verbose_trap("Argument_must_not_be_null");
+}
+
+void f1() {
+  __builtin_verbose_trap("Argument_must_not_be_null");
+  __builtin_verbose_trap("hello");
+}
+
+template <const char * const str>
+void f2() {
+  __builtin_verbose_trap(str);
+}
+
+void f3() {
+  f2<constMsg>();
+}
diff --git a/clang/test/SemaCXX/verbose-trap.cpp b/clang/test/SemaCXX/verbose-trap.cpp
new file mode 100644
index 000000000000000..7c152325d30815e
--- /dev/null
+++ b/clang/test/SemaCXX/verbose-trap.cpp
@@ -0,0 +1,28 @@
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcxx-exceptions -verify %s
+
+#if !__has_builtin(__builtin_verbose_trap)
+#error
+#endif
+
+constexpr char const* constMsg1 = "hello";
+char const* const constMsg2 = "hello";
+char const constMsg3[] = "hello";
+
+template <const char * const str>
+void f(const char * arg) {
+  __builtin_verbose_trap("Argument_must_not_be_null");
+  __builtin_verbose_trap("hello" "world");
+  __builtin_verbose_trap(constMsg1);
+  __builtin_verbose_trap(constMsg2);
+  __builtin_verbose_trap(""); // expected-error {{argument to __builtin_verbose_trap must be a non-empty string literal}}
+  __builtin_verbose_trap(); // expected-error {{too few arguments}}
+  __builtin_verbose_trap(0); // expected-error {{argument to __builtin_verbose_trap must be a non-empty string literal}}
+  __builtin_verbose_trap(1); // expected-error {{cannot initialize a parameter of type 'const char *' with}}
+  __builtin_verbose_trap(arg); // expected-error {{argument to __builtin_verbose_trap must be a non-empty string literal}}
+  __builtin_verbose_trap(str);
+  __builtin_verbose_trap(u8"hello");
+}
+
+void test() {
+  f<constMsg3>(nullptr);
+}



More information about the cfe-commits mailing list