[clang] [clang] implement current direction of CWG2765 for string literal comparisons in constant evaluation (PR #109208)

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 19 11:49:33 PDT 2024


================
@@ -324,6 +324,13 @@ class ASTContext : public RefCountedBase<ASTContext> {
   /// This is lazily created.  This is intentionally not serialized.
   mutable llvm::StringMap<StringLiteral *> StringLiteralCache;
 
+  /// The next string literal "version" to allocate during constant evaluation.
+  /// This is used to distinguish between repeated evaluations of the same
+  /// string literal.
+  ///
+  /// TODO: Ensure version numbers don't collide when deserialized.
----------------
zygoloid wrote:

It's not too hard to contrive a situation where there's an observable misbehavior. For example:

```c++
export module A;
export constexpr const char *f() { return "hello"; }
```
```c++
export module B;
import A;
export constexpr const char *p = f();
```
```c++
export module C;
import A;
export constexpr const char *q = f();
```
```c++
import B;
import C;
// Should be non-constant, but these both have version 0 and
// the same `StringLiteral`, so presumably passes.
static_assert(p == q);
// Might return false at runtime in practice.
bool g() { return p == q; }
```

It's a somewhat low-probability misbehavior at least -- you need the version numbers to collide across modules. (They can also collide with this patch after `unsigned` overflow, but I don't think we need to worry about that.)

Maybe this has too high a risk of collisions still, but we could initialize this counter to a hash of the current module name instead of 0. :)

If we want to do this properly, we could track the greatest version used in each module, and offset the versions in each module by the sum of the versions of earlier-loaded modules (and likewise offset the version for the current file by the sum across all modules). Alternatively we could try to recompute versions at import time, building a map of {module ID, imported version} -> new version. But it's not quite that simple, because given

```c++
export module A;
export constexpr const char *p = "foo";
```
```c++
export module B;
import A;
export constexpr const char *q = p;
```
we need the constant value of `p` imported from A to be the same as the constant value of `q` imported from B.

... So in the former case we would need to do proper remapping not just adding an offset, and in the latter case we'd need to track which module ID the imported version *originally* came from. Both options seem expensive.

Maybe we can think of a better way? (Initializing the counter to a hash is sounding a bit better now, huh?)

https://github.com/llvm/llvm-project/pull/109208


More information about the cfe-commits mailing list