[clang] [clang][sema] Add nonnull attribute to builtin format functions (PR #160988)

Nikolas Klauser via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 5 13:55:14 PST 2025


Radovan =?utf-8?q?Božić?= <radovan.bozic at htecgroup.com>,
Radovan =?utf-8?q?Božić?= <radovan.bozic at htecgroup.com>,
Radovan =?utf-8?q?Božić?= <radovan.bozic at htecgroup.com>
Message-ID:
In-Reply-To: <llvm.org/llvm/llvm-project/pull/160988 at github.com>


philnik777 wrote:

> I believe it is intentionally not defined. It's relying on `CHECK_FILE` which is defined as:
> 
> ```
> #ifdef IO_DEBUG
> # define CHECK_FILE(FILE, RET) do {				\
>     if ((FILE) == NULL						\
> 	|| ((FILE)->_flags & _IO_MAGIC_MASK) != _IO_MAGIC)	\
>       {								\
> 	__set_errno (EINVAL);					\
> 	return RET;						\
>       }								\
>   } while (0)
> #else
> # define CHECK_FILE(FILE, RET) do { } while (0)
> #endif
> ```
> 
> so it depends on whether `IO_DEBUG` is defined as to whether you get any check for null: https://github.com/bminor/glibc/blob/26e48102108284d2474f83f5afee56b994c86d54/libio/libioP.h#L974

Ah, I didn't see that part. In that case I don't have any examples.

> > > Again, this is matching GCC's behavior and I think that is reasonable given how bad our codegen is otherwise: https://godbolt.org/z/9zh7KTGvG
> > 
> > 
> > Does this really matter though?
> 
> To me? Not really. To someone who is wondering why our codegen is worse than GCC's? Probably.

Well, I'm not convinced there is a case where this actually matters, given that no implementation I could find annotates their declaration with `[[gnu::nonnull]]` themselves. Anyways, I don't think this is the fundamental contention here.

> > Sure, the CodeGen doesn't look great, but I'd expect by far most format strings are literals.
> 
> I'd like to agree, but there are common counterexamples too. e.g., localization often requires non-literal format strings. And this isn't just about the format strings, it's also about the `FILE *` being passed (neither can be null), and those are never literals.

I'm aware that there are cases where it doesn't work (localization was actually the main case I was thinking about). In these cases I'd argue that the function returning the format string should be marked `[[gnu::returns_nonnull]]` instead, as I said below. I would also agree that it's not always feasible to do such things as well though.

> I think this is already the precedent? [...]

I don't think so. AFAICT the currently added attributes simply add semantics guaranteed by some standard/are fundamental parts of the contract of a function (e.g. `noreturn`), allow removing dead code or some amount of constant folding (e.g. `pure`), or enable diagnostics (e.g. `format`). None of these allow exploitation of UB the library could define I think.

> [...]
> Do you have some usage pattern you're worried about? If we have evidence this will be disruptive, that would be compelling to know about.

I don't. I mostly didn't see any discussion of whether it's a good idea to potentially introduce UB here, which worried me somewhat, especially since I think this patch does introduce a novel idea.


https://github.com/llvm/llvm-project/pull/160988


More information about the cfe-commits mailing list