[PATCH] D112579: Allow non-variadic functions to be attributed with `__attribute__((format))`

FĂ©lix Cloutier via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 29 15:34:53 PDT 2021


fcloutier added a comment.

Thanks for looking, Aaron. You're right that the main utility of the aggregation of format warnings is to extend C's type checking because there is no other good way, or good place, to do it. I have built hundreds of millions of shipping lines of C, C++ and Objective-C, and this change seems like it would be an effective fix in several places where we don't currently have anywhere else to go.

For variadic templates, you're right that at some final instantiation, the compiler will have all the format argument types in hand. What gets lost in piping and substitution is actually the format string that Clang must type-check against. You can see this in action in the LLVM code base, actually: if we turn on -Wformat-nonliteral for files including llvm/include/Support/Format.h, the warning will trigger for users of the `llvm::format` function. This is because the format string is stored in `format_object_base::Fmt` instead of being directly forwarded, and sprinkling `constexpr` in strategic places won't resolve this issue because SemaChecking has its own custom expression evaluator. Swapping it out for the more common ExprConstant stuff is probably not impossible, but it's difficult because SemaChecking supports its own kind of symbolism. For instance, it's OK to use a format string parameter as the format string argument of another function that wants one, and other attributes like `format_arg` can tell you to assume the same specifiers as you get from another expression. So, we could fix this, but it would be more work, and the same purpose is served by allowing fixed arguments to participate in the `format` attribute checking. Verifying that users of the LLVM `llvm::format` function are doing the right thing is a better experience than letting this bubble up from however many levels of template instantiation there is.

Type checking may well actually need to be tested better for the case where variadic argument promotions don't happen, but there's a fairly finite number of ways it could go wrong, and I still think that's the easier way to go.

I don't think that we can easily distinguish between parameters created by variadic templates and fixed parameters from a function without variadic templates. I also think that most people who write functions like llvm::format aggressively do not want to be in charge of type-checking the format string themselves, and would much rather defer to Clang's existing type checker. If Clang allows the format attribute to type-check parameters created by variadic templates, it follows that the path of least resistance is to allow it on functions with fixed arguments.

With that said, there are still use cases for allowing format strings in functions with fixed arguments. (Interestingly, it would not be possible to mix fixed format arguments with variadic format arguments in a C function, so maybe we can prevent that altogether, but it does not remove from the general usefulness of the feature). You'll have to take my word for this, but it's one of the handful of blockers that we have for very broad adoption of -Wformat-nonliteral. The story usually goes like this: there's some function `NSString *foo(NSString *fmt, NSMyFoo *obj1, NSMyBar *obj2)` that needs to do something with `obj1` and `obj2` before it decides whether it actually wants to print them, return something unrelated altogether, throw an exception, etc. General limitations of the C language make it difficult to untangle the print parts from the logic parts for reasons which, to me, align rather closely to compiler capriciousness given how close it is to supporting the feature. This is more common in Objective-C code bases as %@ is a universal format specifier for all object types.

Another way we could approach this problem is to have a new attribute to say that a format argument must have a string that conforms to another constant format string. For instance, I think that `NSString *foo(NSString *fmt, NSMyFoo *obj1, NSMyBar *obj2) __attribute__((format_like(1, @"hello %@ %@!")))`, by the compiler checks that `fmt` has specifiers equivalent to %@ %@, would give us roughly the same safety improvements. It would allow for even more fun things, like passing a format string to a function that will then itself supply arguments to a formatting function.

What do you think?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112579/new/

https://reviews.llvm.org/D112579



More information about the cfe-commits mailing list