[Mlir-commits] [mlir] [MLIR] Make More Specific Function Header For StringLiteral Optimization in `Diagnostic` (PR #112154)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Sun Oct 13 20:35:00 PDT 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-mlir-core

Author: Andrew Luo (AndrewZhaoLuo)

<details>
<summary>Changes</summary>

Diagnostic stores various notes/error messages which might help the user in debugging. For the most part, the `Diagnostic` when receiving an error message will copy and own the contents of the string.

However, there is one optimization where given a `const char*`, the class will assume this is a StringLiteral which is immutable and lifetime matches that of the entire program. As a result, instead of copying the message in these cases the class will simply store the underlying pointer.

This is problematic since `const char*` is not specific enough to always imply a StringLiteral which can lead to bugs, e.g. if the underlying pointer is freed before the diagnostic reports.

We solve this problem by choosing a more specific function signature. While not full-proof, this should cover a lot more cases.

A potentially better alternative is just deleting this special handling of string literals, but I am unsure of the implications (it does sound safe to do however with a negligble impact on performance).

---
Full diff: https://github.com/llvm/llvm-project/pull/112154.diff


3 Files Affected:

- (modified) mlir/include/mlir/IR/Diagnostics.h (+2-1) 
- (modified) mlir/unittests/IR/CMakeLists.txt (+1) 
- (added) mlir/unittests/IR/Diagnostic.cpp (+61) 


``````````diff
diff --git a/mlir/include/mlir/IR/Diagnostics.h b/mlir/include/mlir/IR/Diagnostics.h
index cb30bb3f59688a..8429325412dc97 100644
--- a/mlir/include/mlir/IR/Diagnostics.h
+++ b/mlir/include/mlir/IR/Diagnostics.h
@@ -183,7 +183,8 @@ class Diagnostic {
   Diagnostic &operator<<(StringAttr val);
 
   /// Stream in a string literal.
-  Diagnostic &operator<<(const char *val) {
+  template <size_t n>
+  Diagnostic &operator<<(const char (&val)[n]) {
     arguments.push_back(DiagnosticArgument(val));
     return *this;
   }
diff --git a/mlir/unittests/IR/CMakeLists.txt b/mlir/unittests/IR/CMakeLists.txt
index 547e536dd9cbbf..384116ba5c457e 100644
--- a/mlir/unittests/IR/CMakeLists.txt
+++ b/mlir/unittests/IR/CMakeLists.txt
@@ -4,6 +4,7 @@ add_mlir_unittest(MLIRIRTests
   AffineMapTest.cpp
   AttributeTest.cpp
   AttrTypeReplacerTest.cpp
+  Diagnostic.cpp
   DialectTest.cpp
   InterfaceTest.cpp
   IRMapping.cpp
diff --git a/mlir/unittests/IR/Diagnostic.cpp b/mlir/unittests/IR/Diagnostic.cpp
new file mode 100644
index 00000000000000..dfc83001f173bc
--- /dev/null
+++ b/mlir/unittests/IR/Diagnostic.cpp
@@ -0,0 +1,61 @@
+//===- Diagnostic.cpp - Dialect unit tests -------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "mlir/IR/Diagnostics.h"
+#include "mlir/Support/TypeID.h"
+#include "gtest/gtest.h"
+
+using namespace mlir;
+using namespace mlir::detail;
+
+namespace {
+
+TEST(DiagnosticLifetime, TestCopiesConstCharStar) {
+  const auto *expectedMessage = "Error 1, don't mutate this";
+
+  // Copy expected message into a mutable container, and call the constructor.
+  std::string myStr(expectedMessage);
+
+  mlir::MLIRContext context;
+  Diagnostic diagnostic(mlir::UnknownLoc::get(&context), DiagnosticSeverity::Note);
+  diagnostic << myStr.c_str();
+
+  // Mutate underlying pointer, but ensure diagnostic still has orig. message
+  myStr[0] = '^';
+
+  std::string resultMessage;
+  llvm::raw_string_ostream stringStream(resultMessage);
+  diagnostic.print(stringStream);
+  ASSERT_STREQ(expectedMessage, resultMessage.c_str());
+}
+
+TEST(DiagnosticLifetime, TestLazyCopyStringLiteral) {
+  char charArr[21] = "Error 1, mutate this";
+  mlir::MLIRContext context;
+  Diagnostic diagnostic(mlir::UnknownLoc::get(&context), DiagnosticSeverity::Note);
+
+  // Diagnostic contains optimization which assumes string literals are
+  // represented by `const char[]` type. This is imperfect as we can sometimes
+  // trick the type system as seen below.
+  //
+  // Still we use this to check the diagnostic is lazily storing the pointer.
+  auto addToDiagnosticAsConst = [&diagnostic](const char (&charArr)[21]) {
+    diagnostic << charArr;
+  };
+  addToDiagnosticAsConst(charArr);
+
+  // Mutate the underlying pointer and ensure the string does change
+  charArr[0] = '^';
+
+  std::string resultMessage;
+  llvm::raw_string_ostream stringStream(resultMessage);
+  diagnostic.print(stringStream);
+  ASSERT_STREQ("^rror 1, mutate this", resultMessage.c_str());
+}
+
+} // namespace

``````````

</details>


https://github.com/llvm/llvm-project/pull/112154


More information about the Mlir-commits mailing list