[cfe-dev] Posix format strings

Ted Kremenek kremenek at apple.com
Thu Dec 2 13:10:53 PST 2010


Thanks Paul. Comments inline.

On Dec 2, 2010, at 3:30 AM, Paul Curtis wrote:

> Hi all,
> 
> One of the things that currently annoys me, to a certain extent, is that
> Clang can’t be configured to accept Posix-compliant format strings that
> extend the standard C formats.  There are some applications that use Posix
> extensions so it seems only right to at least try to support Posix format
> strings as this is both an ISO and IEEE standard, after all.
> 
> It looks like some of this is done, but isn’t correctly attributed.  For
> instance, in ParsePrintfSpecifier there is:
> 
>    // Mac OS X (unicode) specific
>    case 'C': k = ConversionSpecifier::CArg; break;
>    case 'S': k = ConversionSpecifier::SArg; break;


Do you have a handy link to the Posix specification of printf?  That would be helpful, as I only saw %C and %S in the Mac OS X documentation.

> 
> The %C and %S formats are, in fact, defined in Posix, so more accurately
> this could be commented // ISO/IEC 9945, IEEE Std 1003.1 aka POSIX.1.

Makes sense.

> 
> The ' (tick) flag to introduce thousands grouping looks like low-hanging
> fruit and should be easy to implement.

I was unaware of the ' (tick) grouping flag.
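
For reference, a minimal sketch of how I understand the flag to behave, assuming a libc that implements the POSIX ' flag. The helper name format_grouped is made up for illustration. The separator comes from the current LC_NUMERIC locale, so in the plain "C" locale no grouping characters are inserted at all.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats value using the POSIX ' (grouping) flag.
   Which separator appears, if any, depends on the active LC_NUMERIC locale. */
static int format_grouped(char *buf, size_t size, long value) {
    return snprintf(buf, size, "%'ld", value);
}
```

Calling setlocale(LC_NUMERIC, "") before format_grouped would pick up the user's locale, and the grouping then depends on that locale's definition.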

> 
> The %n$ (positional) version is a bit more of a challenge.

Positional arguments are already implemented.
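
To be concrete about what "positional" means here, a small sketch, assuming a POSIX libc. The helper name format_swapped is hypothetical. Each %n$ specifier selects the n-th argument after the format string, so the arguments can be consumed in a different order than they are passed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: emits the second argument before the first
   by using POSIX positional (%n$) conversion specifiers. */
static int format_swapped(char *buf, size_t size,
                          const char *first, const char *second) {
    return snprintf(buf, size, "%2$s %1$s", first, second);
}
```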

> 
> So, some questions:
> 
> (1) Is anybody other than me supportive of adding this feature to Clang?

Specifically, what are you referring to by "this feature"?

> (2) If so, should Posix format strings be accepted as default?

For these warnings to be useful, I think they should match reality for the intended target as closely as possible.  Specifically, if the target supports Posix format strings then I think they should be accepted with no extra effort from the developer.  If the target does not support Posix format strings, then we should issue a warning.

In many ways this is no different than the various flags we have in LangOptions to control the behavior of Sema, particularly when handling different dialects of C and C++ (e.g., c99 versus c89, etc.).

> (3) Should there be a front-end flag to select Posix format support?

I think a low-level -cc1 driver option might be appropriate to control this behavior (just as we do with other features such as "blocks"), but ideally the logic that decides whether Posix format strings are supported would live in the high-level driver.  With the low-level driver option, users have the ability to override the default or to explicitly invoke the compiler in a specific configuration.
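
Roughly, the decision I have in mind looks like the sketch below. This is illustration only, not Clang code; the enum and the function name are made up. The target supplies the default and an explicit option, if one was given, wins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state for an explicit user/driver override. */
enum override_state { OVERRIDE_NONE, OVERRIDE_ON, OVERRIDE_OFF };

/* Hypothetical helper: an explicit override wins; otherwise fall back
   to whatever the target is known to support. */
static bool posix_formats_enabled(bool target_supports_posix,
                                  enum override_state user_override) {
    switch (user_override) {
    case OVERRIDE_ON:   return true;
    case OVERRIDE_OFF:  return false;
    case OVERRIDE_NONE: break;
    }
    return target_supports_posix;
}
```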

> I would say that the %C, %D, %@, and %m extensions to ISO C are accepted
> silently by Clang.  That is, there is no -W option to turn on diagnosis of
> these extensions, which feels wrong.

I agree that we should have a separate -cc1 flag, but I do think they should be silently accepted if the target supports them.  Requiring that the user specify an extra command line option all the time to accept them also feels wrong when that is something the compiler can be made aware of in the most common cases.

I'm skeptical that it should be under a separate -W flag.  -W flags are useful for turning warnings on or off, i.e., for silencing diagnostics, but the option we are talking about would actually affect how a format string is parsed.  No -W flag influences the compiler in this way, whereas other command line options, e.g., -fblocks, do change the dialect/semantics of the parsed source file.
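
A toy sketch of the distinction, with every name here hypothetical and much simpler than the real format string parser: the option changes what the parser considers a valid conversion, which is a dialect decision rather than a question of whether to print a diagnostic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a single conversion character. */
enum conv_kind { CONV_INVALID, CONV_ISO_C, CONV_POSIX };

/* Hypothetical stand-in for an option derived from the target or -cc1. */
struct format_options { bool posix_formats; };

/* Classifies one conversion character. %C and %S are only recognized
   when Posix formats are enabled; otherwise they are invalid. */
static enum conv_kind classify_conversion(char c,
                                          const struct format_options *opts) {
    switch (c) {
    case 'd': case 'i': case 'o': case 'u': case 'x': case 'X':
    case 'f': case 'F': case 'e': case 'E': case 'g': case 'G':
    case 'a': case 'A': case 'c': case 's': case 'p': case 'n':
    case '%':
        return CONV_ISO_C;
    case 'C': case 'S':
        return opts->posix_formats ? CONV_POSIX : CONV_INVALID;
    default:
        return CONV_INVALID;
    }
}
```

With posix_formats set, 'S' classifies as CONV_POSIX. With it cleared, the same character is CONV_INVALID, and that is the point at which a warning would be issued.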


