[clang] Constant-evaluate format strings as last resort (PR #135864)
via cfe-commits
cfe-commits at lists.llvm.org
Wed May 7 15:18:43 PDT 2025
Félix Cloutier <fcloutier at apple.com>
In-Reply-To: <llvm.org/llvm/llvm-project/pull/135864 at github.com>
================
@@ -238,3 +246,69 @@ void f(Scoped1 S1, Scoped2 S2) {
}
#endif
+
+#if __cplusplus >= 202000L
+class my_string {
+ char *data;
+ unsigned size;
+
+public:
+ template<unsigned N>
+ constexpr my_string(const char (&literal)[N]) {
+ data = new char[N+1];
+ for (size = 0; size < N; ++size) {
+ data[size] = literal[size];
+ if (data[size] == 0)
+ break;
+ }
+ data[size] = 0;
+ }
+
+ my_string(const my_string &) = delete;
+
+ constexpr my_string(my_string &&that) {
+ data = that.data;
+ size = that.size;
+ that.data = nullptr;
+ that.size = 0;
+ }
+
+ constexpr ~my_string() {
+ delete[] data;
+ }
+
+ template<unsigned N>
+ constexpr void append(const char (&literal)[N]) {
+ char *cat = new char[size + N + 1];
+ char *tmp = cat;
+ for (unsigned i = 0; i < size; ++i) {
+ *tmp++ = data[i];
+ }
+ for (unsigned i = 0; i < N; ++i) {
+ *tmp = literal[i];
+ if (*tmp == 0)
+ break;
+ ++tmp;
+ }
+ *tmp = 0;
+ delete[] data;
+ size = tmp - cat;
+ data = cat;
+ }
+
+ constexpr const char *c_str() const {
+ return data;
+ }
+};
+
+constexpr my_string const_string() {
+ my_string str("hello %s");
+ str.append(", %d");
+ return str;
+}
+
+void test_constexpr_string() {
+ printf(const_string().c_str(), "hello", 123); // no-warning
+ printf(const_string().c_str(), 123, 456); // expected-warning {{format specifies type 'char *' but the argument has type 'int'}}
+}
+#endif
----------------
apple-fcloutier wrote:
I played with this a bit yesterday, and the last push improves the situation. When the format string checker follows a `DeclRefExpr` to a `VarDecl` that has constant initialization (per `VarDecl::hasConstantInitialization`), we now take that into account when evaluating the format string. This addresses the reduction that I made of your issue.
This does not address the issue as you reported it (where the function with `[[gnu::format]]` is itself constexpr/consteval), because at the point where SemaChecking comes into play, we don't yet know that we are in a constant evaluation context. While parsing this definition:
```c++
constexpr auto myvprintf = getvprintf(const_string());
```
Format checking happens inside `Sema::ActOnCallExpr`, but we only know that `myvprintf` has constant initialization after its initializer has been fully parsed. `Sema::isConstantEvaluatedContext()` is still `false` at that point, so we don't have the information that we need.
I'm nervous about making changes around this because I don't understand constant evaluation well enough. If we had a way to bail out of the constant evaluation machinery when running into an `if constexpr`, I could use that, but I don't think we have one.
https://github.com/llvm/llvm-project/pull/135864