[cfe-dev] [RFC] new format string attributes

Marcus Johnson via cfe-dev cfe-dev at lists.llvm.org
Fri Mar 27 16:06:16 PDT 2020


Hey Arthur,

I really like your idea to just use the regular printf format attribute for all formats.

We can handle wprintf with the same logic as well; all we’d have to check is that the format argument is a string literal and that the literal’s designated character type (via the u8, u, U, L, etc. prefixes) matches the type the printf-like function actually expects.
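
For reference, here is a minimal sketch of how the existing attribute is applied to a `char`-based function today. `my_format` is a hypothetical helper made up for illustration, not anything from Clang or a C library; the idea under discussion is that the same attribute would also be accepted when the format parameter is a `char8_t*`, `char16_t*`, or `char32_t*`.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical printf-like helper (illustration only).
// format(printf, 1, 2): parameter 1 is the printf-style format string,
// and the variadic arguments to check start at parameter 2.
__attribute__((format(printf, 1, 2)))
std::string my_format(const char *fmt, ...) {
    char buf[256];
    va_list args;
    va_start(args, fmt);
    int n = std::vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);
    assert(n >= 0);  // vsnprintf returns a negative value on error
    return std::string(buf);  // output is truncated to fit buf if too long
}
```

With this declaration, GCC and Clang warn under -Wformat about a mismatched call such as `my_format("%s", 42)`, while `my_format("%d-%s", 7, "x")` is accepted and returns "7-x".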

As for changing the length modifiers to U16, U32, etc.: sure, it’s kinda disappointing, but it’s doable.
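
To make the reason for the change concrete, here is a small sketch (the function name `render_us_example` is made up for illustration) showing that standard printf already parses `%us` as `%u` followed by a literal `s`, which is why that spelling is not available for char16_t strings:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical demo function (illustration only).
std::string render_us_example() {
    char buf[64];
    // "%u" consumes the unsigned argument; the 's' after it is literal text.
    int n = std::snprintf(buf, sizeof buf, "hello %us world", 42u);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}
```

Calling `render_us_example()` returns "hello 42s world", matching the example quoted below.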

> On Mar 25, 2020, at 1:53 PM, Arthur O'Dwyer via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> 
>> On Wed, Mar 25, 2020 at 12:59 PM Aaron Ballman via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
>> On Wed, Mar 25, 2020 at 11:51 AM Marcus Johnson <marcusljohnson1991 at gmail.com> wrote:
>> > > That doesn't answer why we need a new format archetype. The archetypes
>> > > are used because we want the check to model behavior specific to some
>> > > API. If I understand your proposal properly, you're not proposing to
>> > > add anything like uprintf() to a C library (and such an API doesn't
>> > > already exist), so adding a new archetype surprises me. I would have
>> > > thought the existing archetypes would suffice, but maybe I'm still
>> > > misunderstanding a part of your proposal.
> 
> In particular, if all you want to do is support `__attribute__((format(printf, x, y)))` on function parameters that happen to be of type `char8_t*`, `char16_t*` or `char32_t*`, that should be trivial.  Just look at how Clang works for arguments of type `wchar_t*` and copy that.
> 
> ...Oh wait, it looks like neither GCC nor Clang actually implement format-string checking for wchar_t format strings!
> https://godbolt.org/z/Tk9YCA
> 
>     std::wprintf(L"%s", 42);  // no diagnostic emitted
> 
> So that would be a very good place to start, IMO. Once the code is in place to format-check wide string literals, it should be trivial to extend it to also format-check char{8,16,32}_t literals.
> Here's the existing bug report: https://bugs.llvm.org/show_bug.cgi?id=16810
> 
> Orthogonally, you seem to be proposing that there should be some new printf format specifiers besides %s %c %[ (for char) and %ls %lc %l[ (for wchar_t).  This is not a Clang issue; this is a library-design issue that you should think about as you write your library function that takes a format string (you know, the one you want to apply __attribute__((format)) to).  If you are not writing a library function, then you have nothing to apply the attribute to, and therefore there's no reason for you to need anything changed.
> You throw out the ideas of %us for char16_t, %Us for char32_t, and have no suggestion for char8_t. However, you cannot use %us as a format specifier, because printf already gives that sequence a valid meaning:
> 
>     printf("hello %us world", 42u);  // prints "hello 42s world"
> 
> My off-the-top-of-my-head idea is that you should take a hint from MSVC; they provide %I32d, %I64d, etc., for integer types, so how about %C8s, %C16s, and %C32s for Unicode character string types?  However, again, this is an issue to think about as you design your `MyPrintfLikeFunction` within your own codebase. Maybe you'll find that you don't even need a format specifier for those types.
> 
> (FWIW, the C and C++ party line seems to be that no "%C16s" or "%C32s" is needed, because the modern approach is to separate transcoding from output. You shouldn't be printf'ing Unicode strings directly; you should be first transcoding them into `char` or `wchar_t` strings, and then printf'ing or wprintf'ing those strings. Personally I don't think that approach is very helpful in practice, though.)
> 
> –Arthur